8000 Making clearer what namespaces are public by use of underscores · Issue #14360 · scipy/scipy · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Making clearer what namespaces are public by use of underscores #14360

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
21 tasks done
rgommers opened this issue Jul 6, 2021 · 43 comments
Closed
21 tasks done

Making clearer what namespaces are public by use of underscores #14360

rgommers opened this issue Jul 6, 2021 · 43 comments
Labels
maintenance Items related to regular maintenance tasks
Milestone

Comments

@rgommers
Copy link
Member
rgommers commented Jul 6, 2021

When we add new .py files, we are careful to add underscores to the file names to ensure everyone sees at a glance that this is not a new public namespace. However, we have lots of files that are missing that underscore. This is only because of historical reasons; in the early days of SciPy no one paid attention to this. We had a discussion about it once, probably about 9-10 years ago, and decided to not clean things up back then because the concrete benefits were unclear.

The public namespaces are documented in http://scipy.github.io/devdocs/reference/index.html#api-definition. No one reads the docs though. This has become much more of a problem recently, because authors of other libraries have started to reimplement parts of the SciPy API, for example:

Those libraries can get it wrong by accident, or they just mirror what our actual filenames are even though we tell them it's not meant to be a public namespace. An example of each:

Long story short: we should clean this up. There's two ways we can go about this:

  1. don't deprecate anything. Just move somefile.py to _somefile.py, and then add a new somefile.py which reimports the public functions from _somefile.py and adds a big comment with warnings about this not being a public namespace at the top of the file.
  2. do (1), and then deprecate the functions in somefile

I'd suggest to do (2) by default, and only make exceptions in case something is quite heavily used and adding deprecations would be too disruptive.

We should prioritize modules that are being duplicated in CuPy, JAX, PyTorch et al.

EDIT: a test should also be added that no new private-but-public-looking namespaces are added. The approach can be taken from https://github.com/numpy/numpy/blob/main/numpy/tests/test_public_api.py

EDIT 2: a basic API design principle is: a public object should only be available from one namespace. Having any function in two or more places is just extra technical debt, and with things like dispatching on an API or another library implementing a mirror API, the cost goes up.

>>> from scipy import ndimage
>>> ndimage.filters.gaussian_filter is ndimage.gaussian_filter  # :(
True

EDIT 3: now that it's decided we're going ahead with this, here is a tracker of which modules are done:

  • cluster
  • constants
  • fft
  • fftpack
  • integrate
  • interpolate
  • io
  • linalg
  • misc
  • ndimage
  • odr
  • optimize
  • signal
  • sparse
  • sparse.csgraph
  • sparse.linalg
  • spatial
  • special
  • stats

Also, we should take over the public API test from NumPy:

EDIT 4:

  • right before closing this issue, we should compare master with 1.7.0 and use some version of sorted([x for x in dir(fitpack) if not (x.startswith('_') or x in ('np', 'warnings'))] to compare for each file/namespace we gave an underscore that everything that looks like a regular SciPy function/object is present in the listings from which we raise deprecation warnings. That way, even if people used clearly private things that were not in __all__ in that file, they will still get the warnings rather than a hard break.
@rgommers rgommers added the maintenance Items related to regular maintenance tasks label Jul 6, 2021
@rgommers
Copy link
Member Author
rgommers commented Jul 6, 2021

To check the impact on git history I pushed a branch here: https://github.com/rgommers/scipy/blame/test-renaming-file/scipy/ndimage/. It has 3 new commits:

  1. rename ndimage/filters.py to _filters.py
  2. add a new file filters.py
  3. make a random change in _filters.py

git blame is not affected at all, it does not show the rename on the command line nor in the GitHub UI:

$ git blame -L30,+5 _filters.py
ca465a651f6 Lib/ndimage/Lib/filters.py (Ed Schofield     2006-03-18 13:52:58 +0000 30) 
7e3cda70f54 scipy/ndimage/filters.py   (Gregory R. Lee   2020-02-01 14:41:31 -0700 31) from collections.abc import Iterable
9ccbb172e6a scipy/ndimage/filters.py   (Aditya Bharti    2018-03-12 03:14:07 +0530 32) import warnings
ee4db9c301d Lib/ndimage/filters.py     (Robert Kern      2006-09-24 09:05:13 +0000 33) import numpy
69cfe48f37f scipy/ndimage/filters.py   (Anju Geetha Nair 2018-11-04 01:14:40 -0800 34) import operator

image

If you check the history of the file, you do see a change:

image

That change is the main (only?) downside, but it's a minor one. Locally you can ignore that rename with --follow, e.g.:

$ gitk --follow _filters.py

@AnirudhDagar
Copy link
Member

Thanks, @rgommers for throwing some light on the history of these private namespaces which look like public (without leading underscores). Also, thanks for verifying that renaming still keeps git blame intact.

I'd also lean towards this approach, but I'll wait for more discussion (if any) on this.

do (1), and then deprecate the functions in somefile

Once we have decided, I can start with scipy.ndimage.<module> -> scipy.ndimage._<module>.

@rgommers
Copy link
Member Author

Message to the mailing list about this: https://mail.python.org/pipermail/scipy-dev/2021-July/024943.html

@tylerjereddy
Copy link
Contributor

I think I understand what you're proposing, seems to make sense. One of the drawbacks compared to aggressively private languages I suppose (Rust comes to mind--you need a pub in front of pretty much anything you want public).

@rgommers
Copy link
Member Author

I think I understand what you're proposing, seems to make sense.

I opened a PR with an example to make sure we're on the same page: gh-14419.

8000

@rgommers
Copy link
Member Author

I have updated the issue description with a status tracker. @Smit-create will start moving this forward, starting with odr, linalg, special and stats.

@czgdp1807
Copy link
Member

@Smit-create @rgommers Which modules are left for deprecation? Please let me know, I will pick it up from there. Thanks.

@Smit-create
Copy link
Member

Which modules are left for deprecation?

It's up to date in the OP, I will pick up integrate next.

@czgdp1807
Copy link
Member

I will pick interpolate then.

@rgommers rgommers added this to the 1.8.0 milestone Oct 25, 2021
@h-vetinari
Copy link
Member

Just noticed while testing that scipy.signal.sigtools now is missing without a deprecation warning. This might be intentional, and I'm fine with it, but given that they were mentioned above by @rgommers as being supposed to warn, I thought I'd mention it.

@Smit-create
Copy link
Member

Hi @h-vetinari, I left it intentionally because it only had private functions. See #15105 (comment)

@h-vetinari
Copy link
Member

Hi @h-vetinari, I left it intentionally because it only had private functions. See #15105 (comment)

Fine by me, just wanted to make sure it was intentional and not an oversight.

bdice added a commit to glotzerlab/freud that referenced this issue Feb 6, 2022
VinceBaz added a commit to VinceBaz/neuromaps that referenced this issue Feb 8, 2022
Private namespaces where renamed in scipy 1.8.0 to distinguish them from
public ones: scipy/scipy#14360. We fixed the
import statements appropriately. We used a try/except statement so that
the toolbox still works with scipy < 1.8.0.
kwant-bot pushed a commit to kwant-project/kwant that referenced this issue Mar 30, 2022
See
- scipy/scipy#15220
- scipy/scipy#14360


Currently, it gives the following error message:
```
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Input In [29], in <module>
----> 1 from scipy.sparse.linalg import linsolve

ImportError: cannot import name 'linsolve' from 'scipy.sparse.linalg' (/nfshome/basnijholt/mambaforge/envs/qms-lab-20220224-main/lib/python3.9/site-packages/scipy/sparse/linalg/__init__.py)
```

With this change, the following warning is emitted:
```
/tmp/scratch/basnijholt/ipykernel_59547/2794395227.py:1: DeprecationWarning: The module `scipy.sparse.linalg.dsolve.linsolve` is deprecated. All public names must be imported directly from the `scipy.sparse.linalg` namespace.
  from scipy.sparse.linalg.dsolve import linsolve
```

The following gives no warning:
```
from scipy.sparse.linalg._dsolve import linsolve
```

However, we should reconsider using private features.
@h-vetinari
Copy link
Member

I re-read this issue and was wondering when removal was envisaged. Turns out it's mentioned in the mailing list post that the removal is currently planned for scipy 2.0. I'll reflect it as such in the deprecation tracker.

@mdhaber
Copy link
Contributor
mdhaber commented May 21, 2025

Were sub-sub-packages supposed to be included here? There are several that still expose their innards.

from scipy.cluster.vq import xpx, np, check_random_state
from scipy.cluster.hierarchy import xp_capabilities, array_namespace, is_dask
from scipy.io.wavfile import warnings, struct, sys
from scipy.linalg.blas import functools
from scipy.linalg.lapack import np
from scipy.linalg.interpolative import np
from scipy.spatial.distance import np
from scipy.stats.contingency import np

@lucascolley lucascolley removed this from the 1.8.0 milestone May 21, 2025
@j-bowhay
Copy link
Member

Were sub-sub-packages supposed to be included here? There are several that still expose their innards.

from scipy.cluster.vq import xpx, np, check_random_state
from scipy.cluster.hierarchy import xp_capabilities, array_namespace, is_dask
from scipy.io.wavfile import warnings, struct, sys
from scipy.linalg.blas import functools
from scipy.linalg.lapack import np
from scipy.linalg.interpolative import np
from scipy.spatial.distance import np
from scipy.stats.contingency import np

I noticed this the other day too when you posted prototype array api table. IMO it would be good to clean this up.

@rgommers rgommers added this to the 1.8.0 milestone May 22, 2025
@rgommers
Copy link
Member Author

Yes agreed, it should apply to all public API. Please don't modify years-old closed issues and remove the milestone though. I'll open a new issue for a follow-up. It should also be investigated if these subsubmodules are missing from the public API testing.

@rgommers
Copy link
Member Author

Follow-up issue: gh-23033

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Items related to regular maintenance tasks
Projects
None yet
Development

No branches or pull requests

12 participants
0