-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Making clearer what namespaces are public by use of underscores #14360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
To check the impact on git history I pushed a branch here: https://github.com/rgommers/scipy/blame/test-renaming-file/scipy/ndimage/. It has 3 new commits:
$ git blame -L30,+5 _filters.py
ca465a651f6 Lib/ndimage/Lib/filters.py (Ed Schofield 2006-03-18 13:52:58 +0000 30)
7e3cda70f54 scipy/ndimage/filters.py (Gregory R. Lee 2020-02-01 14:41:31 -0700 31) from collections.abc import Iterable
9ccbb172e6a scipy/ndimage/filters.py (Aditya Bharti 2018-03-12 03:14:07 +0530 32) import warnings
ee4db9c301d Lib/ndimage/filters.py (Robert Kern 2006-09-24 09:05:13 +0000 33) import numpy
69cfe48f37f scipy/ndimage/filters.py (Anju Geetha Nair 2018-11-04 01:14:40 -0800 34) import operator If you check the history of the file, you do see a change: That change is the main (only?) downside, but it's a minor one. Locally you can ignore that rename with $ gitk --follow _filters.py |
Thanks, @rgommers for throwing some light on the history of these private namespaces which look like public (without leading underscores). Also, thanks for verifying that renaming still keeps I'd also lean towards this approach, but I'll wait for more discussion (if any) on this.
Once we have decided, I can start with |
Message to the mailing list about this: https://mail.python.org/pipermail/scipy-dev/2021-July/024943.html |
I think I understand what you're proposing, seems to make sense. One of the drawbacks compared to aggressively private languages I suppose (Rust comes to mind--you need a |
I opened a PR with an example to make sure we're on the same page: gh-14419. |
I have updated the issue description with a status tracker. @Smit-create will start moving this forward, starting with |
@Smit-create @rgommers Which modules are left for deprecation? Please let me know, I will pick it up from there. Thanks. |
It's up to date in the OP, I will pick up |
I will pick |
Hi @h-vetinari, I left it intentionally because it only had private functions. See #15105 (comment) |
Fine by me, just wanted to make sure it was intentional and not an oversight. |
Private namespaces where renamed in scipy 1.8.0 to distinguish them from public ones: scipy/scipy#14360. We fixed the import statements appropriately. We used a try/except statement so that the toolbox still works with scipy < 1.8.0.
See - scipy/scipy#15220 - scipy/scipy#14360 Currently, it gives the following error message: ``` --------------------------------------------------------------------------- ImportError Traceback (most recent call last) Input In [29], in <module> ----> 1 from scipy.sparse.linalg import linsolve ImportError: cannot import name 'linsolve' from 'scipy.sparse.linalg' (/nfshome/basnijholt/mambaforge/envs/qms-lab-20220224-main/lib/python3.9/site-packages/scipy/sparse/linalg/__init__.py) ``` With this change, the following warning is emitted: ``` /tmp/scratch/basnijholt/ipykernel_59547/2794395227.py:1: DeprecationWarning: The module `scipy.sparse.linalg.dsolve.linsolve` is deprecated. All public names must be imported directly from the `scipy.sparse.linalg` namespace. from scipy.sparse.linalg.dsolve import linsolve ``` The following gives no warning: ``` from scipy.sparse.linalg._dsolve import linsolve ``` However, we should reconsider using private features.
I re-read this issue and was wondering when removal was envisaged. Turns out it's mentioned in the mailing list post that the removal is currently planned for scipy 2.0. I'll reflect it as such in the deprecation tracker. |
Were sub-sub-packages supposed to be included here? There are several that still expose their innards. from scipy.cluster.vq import xpx, np, check_random_state
from scipy.cluster.hierarchy import xp_capabilities, array_namespace, is_dask
from scipy.io.wavfile import warnings, struct, sys
from scipy.linalg.blas import functools
from scipy.linalg.lapack import np
from scipy.linalg.interpolative import np
from scipy.spatial.distance import np
from scipy.stats.contingency import np |
I noticed this the other day too when you posted prototype array api table. IMO it would be good to clean this up. |
Yes agreed, it should apply to all public API. Please don't modify years-old closed issues and remove the milestone though. I'll open a new issue for a follow-up. It should also be investigated if these subsubmodules are missing from the public API testing. |
Follow-up issue: gh-23033 |
Uh oh!
There was an error while loading. Please reload this page.
When we add new
.py
files, we are careful to add underscores to the file names to ensure everyone sees at a glance that this is not a new public namespace. However, we have lots of files that are missing that underscore. This is only because of historical reasons; in the early days of SciPy no one paid attention to this. We had a discussion about it once, probably about 9-10 years ago, and decided to not clean things up back then because the concrete benefits were unclear.The public namespaces are documented in http://scipy.github.io/devdocs/reference/index.html#api-definition. No one reads the docs though. This has become much more of a problem recently, because authors of other libraries have started to reimplement parts of the SciPy API, for example:
dask_image
, see http://image.dask.org/en/latest/api.htmltorch.special
namespace that mirrorsscipy.special
: https://pytorch.org/docs/master/special.htmlThose libraries can get it wrong by accident, or they just mirror what our actual filenames are even though we tell them it's not meant to be a public namespace. An example of each:
dask_image
restructuring its namespace to matchscipy.ndimage
scipy.stats.distributions
Long story short: we should clean this up. There's two ways we can go about this:
somefile.py
to_somefile.py
, and then add a newsomefile.py
which reimports the public functions from_somefile.py
and adds a big comment with warnings about this not being a public namespace at the top of the file.somefile
I'd suggest to do (2) by default, and only make exceptions in case something is quite heavily used and adding deprecations would be too disruptive.
We should prioritize modules that are being duplicated in CuPy, JAX, PyTorch et al.
EDIT: a test should also be added that no new private-but-public-looking namespaces are added. The approach can be taken from https://github.com/numpy/numpy/blob/main/numpy/tests/test_public_api.py
EDIT 2: a basic API design principle is: a public object should only be available from one namespace. Having any function in two or more places is just extra technical debt, and with things like dispatching on an API or another library implementing a mirror API, the cost goes up.
EDIT 3: now that it's decided we're going ahead with this, here is a tracker of which modules are done:
Also, we should take over the public API test from NumPy:
EDIT 4:
sorted([x for x in dir(fitpack) if not (x.startswith('_') or x in ('np', 'warnings'))]
to compare for each file/namespace we gave an underscore that everything that looks like a regular SciPy function/object is present in the listings from which we raise deprecation warnings. That way, even if people used clearly private things that were not in__all__
in that file, they will still get the warnings rather than a hard break.The text was updated successfully, but these errors were encountered: