DOC: add docstrings for numeric types#11858
Conversation
|
Test failure is bogus. |
jeffyancey
left a comment
There was a problem hiding this comment.
this looks fine to me.
97ac6b4 to
5c6c027
Compare
seberg
left a comment
There was a problem hiding this comment.
Just comments, did not read all, but I think it should all be good even without going into those, just bous questions.
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
Wondering a bit about the Python int compatible part here. It is mostly valid for python2, and even then wrong on windows (and 32bit linux).
There was a problem hiding this comment.
Yeah, this is a) not true on 32-bit windows, and b) doesn't set __doc__ correctly anyway
There was a problem hiding this comment.
np.dtype('int64').__doc__ is empty, but np.int64.__doc__ shows this string.
There was a problem hiding this comment.
My mistake, I was misinterpreting the results of #10106
There was a problem hiding this comment.
I mainly imitated the existing documentation for all different int sizes to uint sizes. But yes, there's this aliasing going on between e.g., np.int32 is np.intc is true on my machine. So it's actually not straightforward to document this, since documenting np.intc will then also document np.int32.
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
c long float (or long double?). Wonder if it should warn here that it is not an IEEE quad float type. And the same text is true for float96, also... But maybe this is not the place to get into those details. Not sure how prominent this shows up also on the online docs.
There was a problem hiding this comment.
Euhm, yes, that's a mistake, sorry. I'll fix this and add some extra notes on the 96-bits version, as well.
|
I think we need to tread carefully here - we might end up wanting different online docs to the local docs - local docs know the exact mapping of sized aliases to C types, whereas the sphinx docs shouldn't make assumptions about the user's machine. |
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
Not true - on my machine, the character code is l.
If we want to add these docs, I think we should do it to byte/short/intc/int_/longlong, not the sized aliases
OK, this is an important point for good documentation. I will take a second stab, keeping this in mind. To get a better picture: are these docstrings meant to also replace the list of scalar dtypes in the docs on the long run (i.e., with |
|
Perhaps the correct path for now is:
|
9b9b2f2 to
05c601c
Compare
|
I'm not completely happy with how this sketch of a solution looks, but it should be a step closer to the actual correct, yet platform-flexible documentation. Is this where this PR wants to go? |
numpy/core/_add_newdocs.py
Outdated
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
Changed size_t and ssize_t to intptr_t and uintptr_t, but the reference docs are again slightly wrong, here, then: https://docs.scipy.org/doc/numpy/user/basics.types.html
Should these be fixed as well?
|
I like the direction this is going in - I think it will definitely be an improvement over what already exists. I think we'll need to think a little more about how to use this with server-built sphinx builds - but for local users using Right now none of these docstrings are in sphinx anyway, so we can punt that to a later PR. |
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
Strictly this should be double - float_ is an alias defined to be compatible with python float, although admittedly the two are always the same.
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
I think longdouble is the canonical name.
There was a problem hiding this comment.
OK, just followed the list here: https://docs.scipy.org/doc/numpy/reference/arrays.scalars.html. But indeed, longfloat seems to be an alias for longdouble.
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
This one is an alias, not a real type
There was a problem hiding this comment.
Does that mean that float96is also an alias, and either float96 or float128 is matched against longdouble?
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
Rather than using a hack to make a local variable in a comprehension, I think this would be clearer as a generator using yield
|
Update: fixed minor things, but:
But these can probably be fixed once these docstrings get included in the reference documentation? Next, stackoverflow shows a few hackish way of detecting sphinx (https://stackoverflow.com/questions/20843737/check-if-sphinx-doc-called-the-script), but something feels wrong about actually accessing and using this when generating the docs? |
|
Lets leave worrying about sphinx to a later PR. Both your points are correct - but I think the other docs are just subtly wrong, and I would prefer to aim for correctness not consistency - your most recent update looks good |
There was a problem hiding this comment.
Can you include both the name and doc in the list of aliases? Right now I think you only add the doc.
Also, adding float_ and complex_ to the list of aliases would be handy.
Can you show the output of help(np.intc), help(np.long), and help(np.longlong) after this change?
I suspect there's some overlap here with #10151
Yes, forgot that, thanks!
Since these seem to be 'hardcoded' aliases, I'm adding these as part of the docstring itself, and not with this alias list construct. |
Fine by me. |
>>> print(np.intc.__doc__)
Signed integer type, compatible with C ``int``.
Character code: ``'i'``.
>>> print(np.long.__doc__)
int(x=0) -> integer
int(x, base=10) -> integer
Convert a number or string to an integer, or return 0 if no arguments
are given. If x is a number, return x.__int__(). For floating point
numbers, this truncates towards zero.
If x is not a number or if base is given, then x must be a string,
bytes, or bytearray instance representing an integer literal in the
given base. The literal can be preceded by '+' or '-' and be surrounded
by whitespace. The base defaults to 10. Valid bases are 0 and 2-36.
Base 0 means to interpret the base from the string as an integer literal.
>>> int('0b100', base=0)
4
>>> print(np.longlong.__doc__)
Signed integer type, compatible with C ``long long``.
Character code: ``'q'``.I'll have a closer look at that other PR later. |
|
I'd specifically like to see the output of |
numpy/core/_add_newdocs.py
Outdated
There was a problem hiding this comment.
I think this should be AttributeError.
There was a problem hiding this comment.
Stolen from add_newdoc in core/function_base.py, where it also says Exception, but I've adapted it.
Sorry, here you go:
|
eric-wieser
left a comment
There was a problem hiding this comment.
Can't comment inline on mobile: Line 8017 should not be passing a default to getattr - the attribute will always exist for canonical names
numpy/core/_add_newdocs.py
Outdated
| def add_newdoc_for_numeric_type(obj, fixed_aliases, possible_aliases, doc): | ||
| o = getattr(_numerictypes, obj, None) | ||
| if o is None: | ||
| return |
There was a problem hiding this comment.
From my earlier comment - change this to o = getattr(_numerictypes, obj), since the object is guaranteed to exist.
| for (alias, doc) in aliases: | ||
| alias_type = getattr(_numerictypes, alias, None) | ||
| if alias_type is not None: | ||
| yield (alias_type, alias, doc) |
There was a problem hiding this comment.
Strictly speaking this would be better as:
try:
alias_type = getattr(_numerictypes, alias)
except AttributeError:
pass
else:
yield (alias_type, alias, doc)
As that doesn't silence bugs if an alias ends up somehow being set to None
numpy/core/_add_newdocs.py
Outdated
|
|
||
| add_newdoc('numpy.core.numerictypes', 'object_', | ||
| """Any Python object. Character code: 'O'.""") | ||
| add_newdoc_for_numeric_type('object_', [], [], |
There was a problem hiding this comment.
Not sure I'd consider this numeric. Probably better either to rename the function to *_scalar_type, or leave the non-numeric types as they were. I realize the module is called numerictypes, but we're stuck with that for now.
eric-wieser
left a comment
There was a problem hiding this comment.
Made the changes I suggested below myself. Feel free to tweak if you want, but I think this is ready to go in
I've still implemented the one small change in |
|
Oh, and almost forgot, but I yesterday noticed that this actually doesn't work for the complex types. These types already have a docstring set in the C code, and >>> help(np.cdouble)
Help on class complex128 in module numpy:
class complex128(complexfloating, builtins.complex)
| Composed of two 64 bit floatsI guess I've missed this since been focusing on the signed and unsigned integers, since they had these C type equivalence issues. The line with this docstring is here: https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/scalartypes.c.src#L3773. My instinctive reaction would be to make this consistent and remove the docstring from the C code, but then I don't know the reason why these docstrings are there? |
Seems sensible to me. Even if there is a reason for them to be there, it would apply to all of the docstrings anyway, not just that one, |
| ('complex128', 'Complex number type composed of 2 64-bit-precision floating-point numbers'), | ||
| ('complex192', 'Complex number type composed of 2 96-bit extended-precision floating-point numbers'), | ||
| ('complex256', 'Complex number type composed of 2 128-bit extended-precision floating-point numbers'), | ||
| ]) |
There was a problem hiding this comment.
Is there any real value to having these four separate lists of aliases, rather than building one big list to perform the lookup in?
There was a problem hiding this comment.
Euhm, yeah, the idea was not not loop over irrelevant aliases (while. But that does seem to come at the cost of being slightly more error-prone. I'm guessing you prefer the side of the trade-off that's less error-prone?
There was a problem hiding this comment.
Yeah, especially since collectivey we've proven that that type of error is easy to make and hard to spot. Thanks!
There was a problem hiding this comment.
One less argument to add_newdoc_for_scalar_type as well, as bonus.
…ultiarray/scalartypes.c.src, as they are now set in numpy/core/_add_newdocs.py
…ing 'np.' to fixed aliases
| try: | ||
| alias_type = getattr(_numerictypes, alias) | ||
| except AttributeError: | ||
| pass |
There was a problem hiding this comment.
Could do with a comment here like "the set of aliases that actually exist varies between platforms" or "this alias is not present on this platform" or something
eric-wieser
left a comment
There was a problem hiding this comment.
If you want more visibility for this, a release note under "improvements" mentioning that help(np.intp) or similar now shows a list of common type aliases would seem pretty sensible
c47a0a7 to
dff5de6
Compare
|
Almost done! :-) Running the code from #10106 again, there are a few more undocumented numeric types that seem to be related, part of the type hierarchy of scalar types:
Futhermore, |
|
Perhaps let this go in as-is, and add the rest in another PR? |
|
I'm with @mattip on this one - documenting the abstract types seems like a separate task to documenting the concrete ones. |
|
Just tweaked the release notes - I plan to squash and merge in a day or two, in case anyone else decides to weigh in. |
|
Thanks @YannickJadoul. And thanks to Eric for helping get this knocked into shape. |
|
Thanks indeed, @eric-wieser, for the guidance and remarks! |
This PR adds docstrings for
np.float16,np.uint8,np.uint16,np.uint32,np.uint64. Furthermore, the formatting of the char codes in the existing docstrings is unified to the same format.Where to generate the autosummary in sphinx, adding it to the reference documentation is not completely clear, though: the current documentation on dtypes is spread out over 2 or 3 different places:
See #10106.
(Part of EuroSciPy 2018 sprints)