[Tim, on <http://www.python.org/sf/843455>]
> ...
> It's exactly the scheme I described, and the coding went smoothly
> because it was something that could be (and was) fully thought-out in
> advance.  That doesn't rule out conceptual or coding errors, though.

As I noted on the patch in the wee hours, "conceptual errors" wins.  I
out-thought a wrong thing, but one that happened to be good enough to
fix all the new test cases:  it doesn't really matter which objects are
reachable from the objects whose deaths trigger callbacks, what really
matters is which objects are reachable from the callbacks themselves.
The test cases were so incestuous (objects all pointing to each other)
that those turned out to be the same sets, but that's not a necessary
outcome -- although it appears to be a likely outcome.

Here's one that's surprising after the patch:

"""
import weakref, gc

class C:
    def cb(self, ignore):
        print self.__dict__

c1, c2 = C(), C()

c2.me = c2
c2.c1 = c1
c2.wr = weakref.ref(c1, c2.cb)

del c1, c2
print 'about to collect'
gc.collect()
print 'collected'
"""

The callback triggers on the death of c1 then, but c1 isn't in a cycle
at all (it's hanging *off* a cycle), and c2 isn't reachable from c1.
But c2 is reachable from the callback.

c2 is in a self-cycle via c2.me, and in another via c2.wr (which
indirectly points back to c2 via the weakref's bound method object
c2.cb).

After the patch, c1 ends up in the set of objects with an associated
weakref callback, but c2 isn't reachable from that set so tp_clear is
called on c2.  That destroys c2's __dict__ before the callback can get
invoked, so when c1 dies the callback sees a tp_clear'ed c2:

    about to collect
    {}
    collected

I know it's hard for people to get excited about an empty dict <wink>.
But that's not the point:  the point is that if it's possible to expose
an object that's been tp_clear'ed to Python code, then *anything* can
happen.
For example, this minor variation segfaults after the patch, right
after printing "about to collect":

"""
import weakref, gc

class C(object):
    def cb(self, ignore):
        print self.__dict__

class D:
    pass

c1, c2 = D(), C()

c2.me = c2
c2.c1 = c1
c2.wr = weakref.ref(c1, c2.cb)

del c1, c2, C, D
print 'about to collect'
gc.collect()
print 'collected'
"""

In the first example, class C was reachable from c1, and that
protected C from getting tp_clear'ed at all, which was something the
patch was trying to accomplish.  Giving c1 a different class takes
away C's tp_clear immunity, yet C is still reachable from the callback.
Boom.

So what's reachable from a callback?

If the callback is not *itself* part of the garbage getting collected,
then it acts like an external root, and so nothing reachable from the
callback is part of the garbage getting collected either.  gc has no
worries then.

If the callback itself is part of the garbage getting collected, then
the weakref holding the callback must also be part of the garbage
getting collected (else the weakref holding the callback would act as
an external root, preventing the callback from being part of the
garbage being collected too).

My thought then was that a simpler scheme could simply call tp_clear on
the trash weakrefs first.  Calling tp_clear on a weakref just throws
away the associated callbacks (if any) unexecuted, and if they don't
get run then we have no reason to care what's reachable from them
anymore.

The fly in that ointment appears to be that a callback can itself be
the target of a weakref, so that when the callback is thrown away, it
can trigger calling another callback.  At that point I fell asleep
muttering unspeakable oaths.
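
The "external root" case and the fly in the ointment can both be poked
at from pure Python.  What follows is only a minimal sketch, written for
a present-day CPython 3 rather than the 2.x used in the examples above,
and it leans on CPython's reference counting.  The names Keeper, Node,
events, make_inner and outer are made up for illustration.  Case 2 does
not go through gc's tp_clear at all: it discards a weakref with a plain
`del`, which is assumed here to be a fair stand-in for "throwing the
callback away unexecuted".

```python
import gc
import weakref

events = []  # hypothetical log of (who ran, what it saw)


class Keeper:
    """Owns a callback that lives outside the garbage."""

    def __init__(self):
        self.name = "keeper"

    def cb(self, ref):
        # Record what the callback can still see of its own object.
        events.append(("keeper.cb", dict(self.__dict__)))


class Node:
    """Plain object that gets put into a self-cycle."""


# Case 1: the weakref (and so its callback) is reachable from outside
# the garbage, so everything the callback can reach is safe from gc.
keeper = Keeper()
node = Node()
node.me = node                          # self-cycle: only gc can reclaim it
live_wr = weakref.ref(node, keeper.cb)  # weakref stays alive in this module
del node
gc.collect()                            # node is trash, keeper is not


# Case 2: a callback that is itself the target of another weakref.
def make_inner():
    def inner(ref):
        events.append(("inner", None))
    return inner


def outer(ref):
    events.append(("outer", None))


target = Node()
inner = make_inner()
wr_target = weakref.ref(target, inner)  # the weakref holds inner strongly
wr_inner = weakref.ref(inner, outer)    # ... and inner is a weakref target
del inner                               # now only wr_target keeps inner alive
del wr_target                           # inner is discarded without running;
                                        # its death fires outer instead

print(events)
```

In case 1 the callback sees an intact keeper, because nothing reachable
from a live callback is part of the garbage.  In case 2, `inner` never
runs, yet discarding it still causes Python code (`outer`) to run, which
is the complication a "clear the trash weakrefs first" scheme would have
to deal with.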