API: Add device and to_device to numpy.ndarray [Array API]#25233
API: Add device and to_device to numpy.ndarray [Array API]#25233ngoldbaum merged 3 commits intonumpy:mainfrom
device and to_device to numpy.ndarray [Array API]#25233Conversation
device and to_device to numpy.ndarraydevice and to_device to numpy.ndarray [Array API]
4a36606 to
5037b48
Compare
5037b48 to
ebe3503
Compare
4e99405 to
89184df
Compare
|
Hi @seberg, |
|
I think there's enough code out there relying on the identity comparison fast path in There's a real test failure though: |
Right, I need to adjust the regex to conform to pypy's ndarray name. |
|
Now it's ready. |
|
Just for the record - it would be nice to see a little justification here or in the eventual NEP for why we need this, despite the only allowed options being I would also add to all the docstrings for |
94377a8 to
91b8d6f
Compare
device and to_device to numpy.ndarray [Array API]device and to_device to numpy.ndarray [Array API]
|
Hmm, looks like those note admonitions don't show up in the docstrings: https://output.circle-artifacts.com/output/job/ce6f8552-de08-47c3-80bb-91eb2ad750a5/artifacts/0/doc/build/html/reference/generated/numpy.empty.html#numpy.empty |
91b8d6f to
dc7848e
Compare
Ah, I missed one |
be8b328 to
5498f7f
Compare
In accordance with the array API. See numpy#25233.
mhvk
left a comment
There was a problem hiding this comment.
Comments mostly from the point of view that we should try to minimize any impact from an argument that for numpy is not useful. So I'd put device after like, and would try to minimize extra lines added and impact on code by passing it on as much as possible, so that there is only one place where the error gets raised (in the C code parsing device).
| add_newdoc('numpy._core.multiarray', 'asarray', | ||
| """ | ||
| asarray(a, dtype=None, order=None, *, like=None) | ||
| asarray(a, dtype=None, order=None, *, device=None, like=None) |
There was a problem hiding this comment.
Here and below, why is device ahead of like? Since one cannot use it for anything useful, my tendency would be to put it at the end.
There was a problem hiding this comment.
Also, why not just set the default to "cpu"? That seems clearer. (Still fine to accept None).
There was a problem hiding this comment.
Right now the function dispatch requires like to be the last argument - here's a check that raises RuntimeError if a function doesn't comply with it:
numpy/numpy/_core/overrides.py
Lines 149 to 156 in 448e4da
Can I keep device before like or should we update the __array_function__ protocol?
There was a problem hiding this comment.
Also, why not just set the default to
"cpu"? That seems clearer. (Still fine to acceptNone).
That's true that "cpu" is the only supported value, but I would prefer to stick to the Array API standard here. It defines the default value for device as None: https://data-apis.org/array-api/latest/API_specification/generated/array_api.asarray.html
We're adding these keyword arguments for Array API compatibility only, so I would prefer to fully conform with it.
There was a problem hiding this comment.
I don't think it matters from the Array API perspective, it doesn't make NumPy non-compliant, it just makes it explicit about what that default is.
But, I'll give a very slight vote to just keep it None anyway. Just becuase I don't think the more explicit device="cpu" makes it clearer, since users should ignore it anyway (you only use it via xp = arr.__array_namespace__() as an Array API user, never directly as a NumPy user).
There was a problem hiding this comment.
The order doesn't matter much in principle, since it's keyword-only and like is only for other non-numpy libraries as well. However, there's a bigger problem here:
>>> import numpy as np
>>> import dask.array as da
>>> d = da.ones((1, 2))
>>> np.asarray([1, 2], device='cpu')
array([1, 2])
>>> np.asarray([1, 2], device='cpu', like=d)
Traceback (most recent call last):
Cell In[24], line 1
np.asarray([1, 2], device='cpu', like=d)
File ~/code/tmp/dask/dask/array/core.py:1760 in __array_function__
return da_func(*args, **kwargs)
File ~/code/tmp/dask/dask/array/core.py:4581 in asarray
return from_array(a, getitem=getter_inline, **kwargs)
TypeError: from_array() got an unexpected keyword argument 'device'The device keyword probably should not be added to the dispatchers for any of these creation functions. It looks like the __array_function__ protocol wasn't implemented in a forward-looking way - I haven't looked much deeper yet, but at first sight it may be the case that we can never add a new keyword to dispatched functions anymore.
In this case, device is here to support writing cross-library-compatible code via the array API standard. The dispatching protocols kinda were earlier attempts to do something similar. So probably they should not interact anyway?
There was a problem hiding this comment.
Ah wait, it's maybe not that extreme - np.asarray([1, 2], like=d) still works, so new keywords are fine, as long as they are not used (then they don't get forwarded). Of course that's a bit funky, but there's no way to avoid it really.
>>> np.asarray([1, 2], device='cpu')
array([1, 2])
>>> np.asarray([1, 2], like=d)
dask.array<array, shape=(2,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> np.asarray([1, 2], like=d, device='cpu')
...
TypeError: from_array() got an unexpected keyword argument 'device'This would have been much more problematic if libraries like SciPy and scikit-learn supported non-numpy array inputs via __array_ufunc/function__. But since they squash such inputs anyway with np.asarray, there shouldn't be too much of an issue in practice.
And it's fairly easy to add support for new keywords in Dask & co. They just need a new 2.0-compatible release that adds such support, and then device/like can be used together.
There was a problem hiding this comment.
@rgommers - I've gotten quite used to updating astropy's __array_function__ wrappers to remain compatible with numpy functions -- and we've got tests to check that our signatures remain consistent. Of course it doesn't happen that often. So, I don't think that's a worry, and overall I'd tend to not special-case any arguments (i.e., pass it on if present).
On the order: if like has to be last, then just keep it like that, sorry for not noticing myself.
There was a problem hiding this comment.
Just to clarify: the check is only there to test habits, and probably mainly to make sure it is kwarg only.
It may have to be last in C (not sure), but it doesn't really matter anywhere here. But... it also doesn't matter a lot (especially since if someone cares, they can change it).
There was a problem hiding this comment.
Thanks @mhvk, that is useful context. A test that flag mismatches sounds like a great idea. 🤞🏼 most implementers have that.
Co-authored-by: Aaron Meurer <asmeurer@gmail.com>
|
Merging this one as well since all comments have been addressed. Thanks @mtsokol! |
Hi @rgommers @seberg,
This PR introduces
deviceandto_devicemembers tonumpy.ndarray.Additionally,
devicekeyword-only arguments were added to:numpy.asarray,numpy.arange,numpy.empty,numpy.empty_like,numpy.eye,numpy.full,numpy.full_like,numpy.linspace,numpy.ones,numpy.ones_like,numpy.zeros, andnumpy.zeros_like.