
BUG: Fix memory leak of buffer-info cache due to relaxed strides #16936

Merged
mattip merged 4 commits into numpy:master from seberg:issue-16934
Oct 22, 2020

Conversation

@seberg
Member

@seberg seberg commented Jul 23, 2020

When relaxed strides checking is active (and has an effect), we recalculate
the strides to export "clean" strides in the buffer interface.
(Python and probably some other consumers of the buffer interface expect
this; i.e. NumPy has fully switched to and embraced relaxed strides, but
the buffer interface at large probably has not.)

However, the place where the strides were "fixed" meant that, whenever the
strides had to be fixed, the old, cached buffer-info was not reused when it
should have been.

This moves the "fixing" logic so that reuse will occur. It leaves one issue
open: an array shaped e.g. (1, 10) is both C- and F-contiguous. Thus, if it
is exported as C-contiguous, then as F-contiguous, and then again as
C-contiguous, this will work, but the last export will compare to the
F-contiguous export and thus still leak memory.

Addresses gh-16934 (but does leave a small hole).
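
For illustration only, here is a minimal C sketch (not the actual NumPy code or the regression test; the function name and loop count are made up) of the alternating export pattern described above, using the standard CPython buffer API:

```c
#include <Python.h>

/*
 * Sketch: alternately export a (1, 10) array, which is both C- and
 * F-contiguous, with C- and F-contiguous buffer requests.  Before this
 * fix, exports whose strides had to be "fixed" did not reuse the cached
 * buffer-info, so each request could append a new entry that is only
 * freed when the array itself is deallocated.
 */
static int
export_alternating(PyObject *arr)
{
    for (int i = 0; i < 100; i++) {
        Py_buffer view;
        int flags = (i % 2 == 0) ? PyBUF_C_CONTIGUOUS : PyBUF_F_CONTIGUOUS;
        if (PyObject_GetBuffer(arr, &view, flags) < 0) {
            return -1;
        }
        /* Releasing the view does not free the cached buffer-info. */
        PyBuffer_Release(&view);
    }
    return 0;
}
```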


@charris marking this as a backport candidate. But that code got some small changes, so it may be that the diff does not apply cleanly (although I think it should).

Member Author
@seberg seberg Jul 23, 2020

I have confirmed that this slightly strange modification of the test does cause it to fail when using pytest-leaks.

EDIT: Not with this fix of course.

Comment on lines +517 to +533
Member

Something for a follow-up - this is overly strict: if the user did not request a contiguous buffer at all, we still construct these replacement strides.
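
As a rough sketch of that follow-up idea (the helper name is hypothetical and this is not NumPy's actual code), the getbuffer flags already encode whether any form of contiguity was requested:

```c
#include <Python.h>

/*
 * Hypothetical helper: report whether the buffer request asked for any
 * kind of contiguity.  Each PyBUF_*_CONTIGUOUS flag is a mask that
 * includes PyBUF_STRIDES, so compare against the full mask.
 */
static int
requests_contiguity(int flags)
{
    return ((flags & PyBUF_C_CONTIGUOUS) == PyBUF_C_CONTIGUOUS ||
            (flags & PyBUF_F_CONTIGUOUS) == PyBUF_F_CONTIGUOUS ||
            (flags & PyBUF_ANY_CONTIGUOUS) == PyBUF_ANY_CONTIGUOUS);
}
```

The replacement strides would then only be needed when this returns true - modulo the Cython caveat raised in the reply below.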

Member Author

IIRC there was a reason for that. Cython may sometimes request a buffer without specifying the flag, but then check for it anyway later on (potentially even raising an error). I somewhat expect Cython has caught up with our interpretation of relaxed strides over time, but I am not 100% sure.

Contributor

If a new buffer info is always constructed anyway, what's the benefit of having a cache at all? Also, it's a list but I only ever see the last item in the list referenced. Couldn't it be replaced with a single object per array? Then there is no memory growth at all.

Member Author

Yeah, it would be nice to skip part of that sometimes... and it would be a small speed enhancement. But we have to reuse the old one so that you do not leak memory, and we need to hold on to it to be able to free it again.

The problem is that we can only free these when the array is deleted, which is the reason for your memory leak. This is a fundamental issue around the buffer protocol, which actually also provides the answer to the issue (by providing a free function). But due to some backcompat issues, we can't just use that...

Times change, and the need for that backcompat is probably going down. Maybe one more thing that would be nice to try out if we had a major release; I doubt it would create a big issue in most cases.
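
A sketch of that protocol-level answer (illustrative only; the struct and function names are made up, and this is precisely what the back-compat concern prevents NumPy from doing today): per-export data can be stored in view->internal and freed in bf_releasebuffer.

```c
#include <Python.h>

/* Per-export data an exporter could free as soon as the view is released. */
typedef struct {
    Py_ssize_t *shape_and_strides;
} export_info;

static int
example_getbuffer(PyObject *self, Py_buffer *view, int flags)
{
    export_info *info = PyMem_Malloc(sizeof(export_info));
    if (info == NULL) {
        view->obj = NULL;
        PyErr_NoMemory();
        return -1;
    }
    info->shape_and_strides = NULL;  /* would hold the exported shape/strides */
    /* ... fill the remaining Py_buffer fields from the array here ... */
    view->internal = info;           /* exporter-private slot in Py_buffer */
    Py_INCREF(self);
    view->obj = self;
    return 0;
}

static void
example_releasebuffer(PyObject *self, Py_buffer *view)
{
    /* Called once per successful getbuffer: free the per-export data. */
    export_info *info = (export_info *)view->internal;
    PyMem_Free(info->shape_and_strides);
    PyMem_Free(info);
}
```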

Member Author

Sorry, the important part is that, even though nobody likes it, arrays are currently mutable in both shape/strides and dtype... So we can't just stick to a single exported buffer object :(.

Contributor

So to sum up, it isn't a cache in the usual sense but instead a store of all buffer objects.

I see this was 577dbbd, which is 11 years old. If the fix is 11 years old, perhaps that PyArg_ParseTuple issue has been fixed in the meantime?

Member Author

No, that issue is still in PyArg_ParseTuple, with an ugly comment there as well. It may have fewer users now, though. And yes, you are right, the name "cache" is a misnomer, since it is persistent.

@mattip mattip requested a review from eric-wieser August 13, 2020 08:36
Comment on lines 633 to 639
Member

This seems very easy to fix - just make the key the tuple PyTuple_Pack(2, PyLong_FromVoidPtr((void *)obj), PyBool_FromLong(f_contiguous)) (with error checking).
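
Spelled out with the error checking, that suggestion might look roughly like this (an illustrative sketch, not the merged code; note that PyTuple_Pack does not steal references):

```c
#include <Python.h>

/* Build a (pointer, f_contiguous) tuple to use as the storage key. */
static PyObject *
make_buffer_info_key(PyObject *obj, int f_contiguous)
{
    PyObject *ptr = PyLong_FromVoidPtr((void *)obj);
    if (ptr == NULL) {
        return NULL;
    }
    /* PyBool_FromLong always succeeds (returns Py_True or Py_False). */
    PyObject *flag = PyBool_FromLong(f_contiguous);
    /* PyTuple_Pack does not steal references, so drop ours afterwards. */
    PyObject *key = PyTuple_Pack(2, ptr, flag);
    Py_DECREF(ptr);
    Py_DECREF(flag);
    return key;  /* NULL (with an exception set) on failure */
}
```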

Member Author

Hmmm, it does mean you have to delete two elements on every single array deallocation, though.

Member Author

So it trades off something that should never happen for a small speed penalty on every deallocation right now.

Member Author

Sorry, I guess that is not true; it might happen, but it is unlikely to be common. The other alternative is to always probe 2 buffers into the past for "reuse" - and actually, I guess only when the array is both C- and F-contiguous, which is quick to check.
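
A rough sketch of that probe-two-entries idea (all names here are illustrative placeholders, not NumPy's internal structures):

```c
#include <Python.h>
#include <string.h>

/* Simplified stand-in for the cached per-export buffer info. */
typedef struct {
    int ndim;
    Py_ssize_t *shape_and_strides;  /* length 2 * ndim */
} buffer_info;

static int
buffer_info_equal(const buffer_info *a, const buffer_info *b)
{
    if (a->ndim != b->ndim) {
        return 0;
    }
    return memcmp(a->shape_and_strides, b->shape_and_strides,
                  2 * a->ndim * sizeof(Py_ssize_t)) == 0;
}

/*
 * Probe the most recent entry for reuse as before, but when the array is
 * both C- and F-contiguous also check the entry before it, since the two
 * contiguity flavours may have been exported alternately.
 */
static buffer_info *
find_reusable_info(buffer_info **stored, Py_ssize_t n_stored,
                   const buffer_info *fresh, int c_and_f_contiguous)
{
    Py_ssize_t n_probe = c_and_f_contiguous ? 2 : 1;
    for (Py_ssize_t i = 0; i < n_probe && i < n_stored; i++) {
        if (buffer_info_equal(stored[n_stored - 1 - i], fresh)) {
            return stored[n_stored - 1 - i];
        }
    }
    return NULL;  /* nothing matches; the caller stores `fresh` */
}
```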

Member

If you think using a (pointer, order) key is too expensive, then it's probably best to add that justification in a source comment.

Member Author

I think the issue was that I had neither of those ideas when writing that comment. And I do hate 98% solutions. I will have a quick look at whether inspecting two list elements is easy; if not, I will expand the comment.

Member Author

OK, I pushed a fix; too often, just staring at it and wondering takes more time than the fix :(. I have added a test which trips when run with pytest-leaks. I have confirmed that it fails before the fix and passes after.

Member
@eric-wieser eric-wieser Aug 13, 2020

As an aside for a future PR, I think we could get a reasonable speedup for the cache-hit case by using PyCapsule objects instead of PyLong_AsVoidPtr here - converting pointers to big integers and back seems silly when we could just keep the pointer around.
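
For illustration, the capsule variant might look roughly like this (a sketch under assumptions: the capsule name and helper functions are invented, and `info` stands for the exporter-owned buffer-info pointer):

```c
#include <Python.h>

/* Wrap the raw buffer-info pointer instead of converting it to an int. */
static PyObject *
wrap_info_pointer(void *info)
{
    /* No destructor: the exporter still owns and frees `info` itself. */
    return PyCapsule_New(info, "numpy._buffer_info", NULL);
}

static void *
unwrap_info_pointer(PyObject *capsule)
{
    /* Returns NULL and sets an exception if the name does not match. */
    return PyCapsule_GetPointer(capsule, "numpy._buffer_info");
}
```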

Member Author

Possibly (and probably nicer in any case). I would prefer the approach in gh-16938, but I don't have the mental capacity to figure out how annoying growing the struct is (I doubt it is very) and, mainly, what could be done to reduce the chance of issues.

@charris
Member

charris commented Sep 2, 2020

@seberg @eric-wieser There is a push to release 1.19.2; is this PR ready?

@seberg
Member Author

seberg commented Sep 4, 2020

Hmm, yeah, I think it is; I am not sure the last commit was ever reviewed, though.

@charris
Member

charris commented Sep 8, 2020

Looks like this is probably ready, but I'm going to push it off to 1.19.3 for the backport. We will need to make that release for Python 3.9 and (hopefully) the Windows 10 version 2004 fix.

@charris charris added this to the 1.19.3 release milestone Sep 8, 2020
@seberg
Member Author

seberg commented Sep 11, 2020

I pushed a tiny bit of code cleanup in preparation for a fix for gh-17294.

Member

why does this fail without relaxed strides?

Member Author

The array will be only C-contiguous, so you can't request an F-contiguous buffer.
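
For illustration (a sketch, not the actual test): an F-contiguous buffer request on an array that is only C-contiguous is simply refused by the exporter, so the code path the test exercises is never reached without relaxed strides.

```c
#include <Python.h>

/* Returns 1 if an F-contiguous export succeeds, 0 if it is refused. */
static int
try_f_contiguous_export(PyObject *arr)
{
    Py_buffer view;
    if (PyObject_GetBuffer(arr, &view, PyBUF_F_CONTIGUOUS) < 0) {
        /* The exporter raised an error: the array is not F-contiguous. */
        PyErr_Clear();
        return 0;
    }
    PyBuffer_Release(&view);
    return 1;
}
```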

Member

Makes sense, thanks. Am I right in thinking code coverage skips this test, based on the review comment it left in the diff?

Member Author

That seems strange; unfortunately, the comment is gone now due to the small comment change, so I will have to see later which branch is uncovered and whether it should be.

Member Author

Maybe the uncovered path was within the else block of #if NPY_RELAXED_STRIDES_CHECKING? That can't be hit, since codecov should be running with the default setup.

Member Author

Well, the code coverage didn't lie: the code got messed up in a cleanup attempt... The test was missing the strides check needed to notice it :(.

Anyway, fixed now...

Member
@eric-wieser eric-wieser left a comment

Sorry for the delay reviewing this - I think the code looks good; I just want to check that I understand the test.

@charris
Member

charris commented Oct 19, 2020

What is the status of this? The CircleCI failures are fixed on master and can be ignored.

Exporting these multiple times alternating would previously cause
a new buffer-info to be created each time.
@mattip mattip merged commit 84626f2 into numpy:master Oct 22, 2020
@mattip
Member

mattip commented Oct 22, 2020

Thanks @seberg
