Skip to content

Conversation

@pablogsal
Copy link
Member

@pablogsal pablogsal commented Oct 10, 2019

Copy link
Member

@tim-one tim-one left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This surprised me - nice! Left some comments, but am too fried to do a thorough review tonight. It's certainly on the intended track.

Copy link
Member

@tim-one tim-one left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:Left more small comments. Higher level: does check_garbage() really serve a purpose anymore? It's doing 2/3rds of the work done by the new function just to see whether at least one thing has become directly reachable. Which the new function does anyway. Since the amount of trash is usually small, the relative expense of running the new function all the time is minor (it's not crawling over the entire generation, just the trash objects - which is usually an empty list!).

The fewer moving parts the better 😉

@pablogsal
Copy link
Member Author

pablogsal commented Oct 11, 2019

Higher level: does check_garbage() really serve a purpose anymore? It's doing 2/3rds of the work done by the new function just to see whether at least one thing has become directly reachable. Which the new function does anyway. Since the amount of trash is usually small, the relative expense of running the new function all the time is minor (it's not crawling over the entire generation, just the trash objects - which is usually an empty list!).

I very much agree with this! I did not have removed check_garbage because I wanted to make sure everything else looks fine before discussing about just doing the work every time (calling deduce_unreachable over unreachable).

On the other hand check_garbage is setting clearing the flags (it calls gc_clear_collecting) so someone else needs to do that if we remove the call. Where do you think is the best place to do this if we remove the function?

To make sure our assumption is correct, I will run the pyperformance test suite :) (Although I think is not really necesary, but checking does not hurt),

@tim-one
Copy link
Member

tim-one commented Oct 11, 2019

On the other hand check_garbage is setting clearing the flags (it calls gc_clear_collecting) so someone else needs to do that if we remove the call. Where do you think is the best place to do this if we remove the function?

The bit-fiddling obfuscation in this module is depressing ☹️ There are no principled answers: the only scheme at work appears to be "delay undoing a trick as long as possible, regardless of how long that leaves things in a variety of semi-broken states, or how obscure they are".

So repair all the remaining damage before delete_garbage() is called? That's the last non-trivial thing collection does.

@pablogsal
Copy link
Member Author

pablogsal commented Oct 11, 2019

The bit-fiddling obfuscation in this module is depressing ☹️ There are no principled answers: the only scheme at work appears to be "delay undoing a trick as long as possible, regardless of how long that leaves things in a variety of semi-broken states, or how obscure they are".

Yep, modifying this code is surprisingly tricky as there are lots of contracts not precisely well documented and many optimization details make the thing even more obscure.

So repair all the remaining damage before delete_garbage() is called? That's the last non-trivial thing collection does.

Unless we somehow introduce the call to gc_clear_collecting into some existing function, we will end implementing 90% of the work that check_garbage is doing if we add another function that iterates over the lists to call gc_clear_collecting, as the main overhead is the iteration over these lists.

@tim-one
Copy link
Member

tim-one commented Oct 12, 2019

I'm not looking to do less work than check_garbage, just looking to do it once instead of twice. More, I think it's a Bad Idea to follow this module's lead of being as "clever" as possible with no regard for how brittle that makes the code. Do the simplest & most straightforward thing instead?

Looking at older versions of this code is good for the soul 😃 Neil's original was a joy to work with, subtle algorithms but coded in completely straightforward ways. I dare say I improved some on that, by introducing a concrete notion of "state", with well defined transitions that were trivial to implement and query. Pointers were always pointers, ints were always ints ...

But the gc header struct had 3 members then (next, prev, and refcount copy). All the pain was introduced to get rid of the header's refcount member, and almost nothing was straightforward anymore. It's not a trade-off I would have made, but it's certainly defensible.

In any case, the ubiquitous bit fiddling now also slowed the code. I don't at all mind making it a bit slower still to preserve some sanity 😜

@pablogsal
Copy link
Member Author

pablogsal commented Oct 12, 2019

@tim-one Ok, after some design iterations, I am more or less happy with the shape of
7ccc564 . This commit removes the check_garbage function as per our discussion and encapsulates all the work for handling resurrecting objects in a new function. Better names are welcomed 😜. I think is now a bit more clear and makes things a bit more sane :)

Note: I have rebased to pick the latest GC_DEBUG changes that we made in master.

@pablogsal pablogsal force-pushed the bpo-38379 branch 3 times, most recently from 3fbbc5c to c27fa80 Compare October 12, 2019 04:07
Copy link
Member

@tim-one tim-one left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Left some last minor comments.

Copy link
Member

@tim-one tim-one left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to ship. Nice work!

@pablogsal
Copy link
Member Author

Thank you very much for the thorough review! :)

@pablogsal pablogsal merged commit 466326d into python:master Oct 13, 2019
@pablogsal pablogsal deleted the bpo-38379 branch October 13, 2019 15:49
jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019
…jects resurrect (pythonGH-16687)

Currently if any finalizer invoked during garbage collection resurrects any object, the gc gives up and aborts the collection. Although finalizers are assured to only run once per object, this behaviour of the gc can lead to an ever-increasing memory situation if new resurrecting objects are allocated in every new gc collection.

To avoid this, recompute what objects among the unreachable set need to be resurrected and what objects can be safely collected. In this way, resurrecting objects will not block the collection of other objects in the unreachable set.
shihai1991 pushed a commit to shihai1991/cpython that referenced this pull request Jan 31, 2020
…jects resurrect (pythonGH-16687)

Currently if any finalizer invoked during garbage collection resurrects any object, the gc gives up and aborts the collection. Although finalizers are assured to only run once per object, this behaviour of the gc can lead to an ever-increasing memory situation if new resurrecting objects are allocated in every new gc collection.

To avoid this, recompute what objects among the unreachable set need to be resurrected and what objects can be safely collected. In this way, resurrecting objects will not block the collection of other objects in the unreachable set.
oraluben pushed a commit to oraluben/cpython that referenced this pull request Mar 5, 2024
…jects resurrect (pythonGH-16687)

Currently if any finalizer invoked during garbage collection resurrects any object, the gc gives up and aborts the collection. Although finalizers are assured to only run once per object, this behaviour of the gc can lead to an ever-increasing memory situation if new resurrecting objects are allocated in every new gc collection.

To avoid this, recompute what objects among the unreachable set need to be resurrected and what objects can be safely collected. In this way, resurrecting objects will not block the collection of other objects in the unreachable set.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants