[Tim] > This example bothers me a little, under current Python CVS: > > """ > import gc, weakref > > def C_gone(ignored): > print "An object of type C went away." > > class C: > def __init__(self): > self.wr = weakref.ref(self, C_gone) > > print "creating non-cyclic C instance" > c = C() > > print "creating cyclic C instance and destroying old one" > c = C() # triggers the print > c.loop = c > > print "getting rid of the cyclic one" > c = None # doesn't trigger the print > gc.collect() # ditto > """ Recapping, it bothers me because the callback doesn't trigger now when c is part of cyclic trash, but if c.wr were an instance with a __del__ method instead, the __del__ method *would* trigger. There are (at least) two easy-to-code and pretty-easy-to-explain ways to get that callback invoked safely. One I sketched last time, and is in the attached patch-callback.txt. (Patches here just have gcmodule.c code changes, no corresponding changes to comments.) The difference is that it invokes a callback on a weakref to trash iff the callback is reachable, instead of invoking a callback on a weakref to trash iff the weakref it's attached to is reachable. All the tests we've accumulated still pass with this change, and the example above does trigger a second print. Another approach is in the attached patch-finalizer.txt. This is more (I think) like what Neil was aiming at last week, and amounts to changing has_finalizer() to return true when it sees a weakref with a callback. Then the weakref, and everything reachable from it, gets moved to the reachable set. After that, no callback can access an unreachable object directly, so callbacks are safe to call on that count. We still (and in the other patch too) clear all weakrefs to trash before breaking cycles, so callbacks can't resurrect trash via active weakrefs either. But in the patch-finalizer case, the callbacks attached to weakrefs moved to the reachable set no longer get invoked "artificially" (forced by gc) -- they trigger iff their referents eventually go away. Some current tests fail with the patch-finalizer approach, but for shallow reasons: fewer callbacks get invoked, and some of the tests currently use "global" (reachable) weakref callbacks as a way to verify that gc really did clean up cyclic trash. Those stop getting invoked. Some of the tests use bound method objects as callbacks, and when a weakref contains a callback that's a bound method object, and the weakref gets moved to the reachable set, its callback gets moved there too, and so the instance wrt which the callback is a bound method becomes reachable too, and so the class of that instance also becomes reachable. As a result, gc doesn't get rid of any of that stuff, so the "external" callbacks never trigger. Instead, the weakrefs with the bound-method callbacks end up in gc.garbage. I don't like that, because if it's possible to get rid of __del__ methods, I'd really like to see gc.garbage go away too. Nothing we've done with weakrefs so far can add anything to gc.garbage, so I like them better on that count. Pragmatically, sticking a weakref in gc.garbage isn't of much use to the end-user now either, because a weakref's callback is neither clearable nor even discoverable now at the Python level. But that could be changed too. What I'm missing is a pile of practical <heh> use cases to distinguish these. Let's look at weak value dicts, for reasons that will become clear. An entry in a WVD d now is of the form key: W where W is an instance of a subclass of weakref.ref, and W has the associated key as an attribute. Every value in d shares the same callback, set up by WVD.__init__(): def __init__(self, *args, **kw): ... def remove(wr, selfref=ref(self)): self = selfref() if self is not None: del self.data[wr.key] self._remove = remove So the callback has a weakref to d, and when it's invoked extracts the specific key from the weakref instance and uses that to delete the appropriate key: W entry from d via the shared callback's own weakref to d. Now it's not clear *why* the callback has a weakref to d. It *could* have been written like this instead: def remove(wr): del self.data[wr.key] In fact, at one time it was written in that kind of way. Rev 1.13 of weakref.py changed it to use a weakref to self, and the motivations in the patch that started this are feeble: http://www.python.org/sf/417795 "It's bad to rely on the garbage collector" and "this could lead to uncollectable cycles if a subclass of Weak*Dictionary defines __del__". The proper responses were "no, it isn't" and "insane code deserves whatever it gets" <0.5 wink>. Anyway, if WVD *had* been left simple(r), it would be relevant to the choices above: if WVD d became part of cyclic trash, patch-finalizer would move all its weakrefs to the reachable set, and then d itself would be reachable too (via the shared callback's strong reference to self). As a result, none of the callbacks would trigger, and d would "leak" (it wouldn't get cleaned up, and wouldn't show up in gc.garbage either -- it would end up in an older generation, clogging gc forever after because it would remain forever after reachable from its (resurrected) weakrefs in gc.garbage). Under patch-callback, d would still go away (the simplest way to understand why is just that d *is* in cyclic trash, and patch-callback never moves anything to the reachable set because of weakrefs or weakref callbacks). The *current* implementation of WVD doesn't have this problem under patch-finalizer. If d is part of cyclic trash, all its weakrefs get moved to the reachable set, but d isn't directly reachable from their shared callback, so d is not moved to the reachable set. d's tp_clear() gets invoked by gc in its cycle-breaking phase, and that decrefs all of d's weakrefs that were moved to the unreachable set, and they go away then (without triggering their callbacks) via the normal refcount route. The point is that this is all pretty subtle, and it bothers me that "the obvious" way to implement WVD (the pre-patch-417795 way) would lead to a worse outcome under patch-finalizer, so I also fear that Python's current weakref users may be doing similar things now that would also suffer under patch-finalizer. OTOH, there's something conceptually nice about patch-finalizer -- it is the easiest approach to explain (well, except for approaches that allow segfaults to occur ...). On the third hand, gc.garbage appears poorly understood by users now, and patch-finalizer is actually easy to explain only to those who already understand gc.garbage. -------------- next part -------------- Index: Modules/gcmodule.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Modules/gcmodule.c,v retrieving revision 2.80 diff -u -r2.80 gcmodule.c --- Modules/gcmodule.c 1 Nov 2004 16:39:57 -0000 2.80 +++ Modules/gcmodule.c 4 Nov 2004 03:37:50 -0000 @@ -566,9 +566,8 @@ * to imagine how calling it later could create a problem for us. wr * is moved to wrcb_to_call in this case. */ - if (IS_TENTATIVELY_UNREACHABLE(wr)) + if (IS_TENTATIVELY_UNREACHABLE(wr->wr_callback)) continue; - assert(IS_REACHABLE(wr)); /* Create a new reference so that wr can't go away * before we can process it again. @@ -577,9 +576,8 @@ /* Move wr to wrcb_to_call, for the next pass. */ wrasgc = AS_GC(wr); - assert(wrasgc != next); /* wrasgc is reachable, but - next isn't, so they can't - be the same */ + if (wrasgc == next) + next = wrasgc->gc.gc_next; gc_list_move(wrasgc, &wrcb_to_call); } } @@ -593,7 +591,6 @@ gc = wrcb_to_call.gc.gc_next; op = FROM_GC(gc); - assert(IS_REACHABLE(op)); assert(PyWeakref_Check(op)); wr = (PyWeakReference *)op; callback = wr->wr_callback; @@ -621,6 +618,7 @@ if (wrcb_to_call.gc.gc_next == gc) { /* object is still alive -- move it */ gc_list_move(gc, old); + gc->gc.gc_refs = GC_REACHABLE; } else ++num_freed; -------------- next part -------------- Index: Modules/gcmodule.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Modules/gcmodule.c,v retrieving revision 2.80 diff -u -r2.80 gcmodule.c --- Modules/gcmodule.c 1 Nov 2004 16:39:57 -0000 2.80 +++ Modules/gcmodule.c 4 Nov 2004 03:27:10 -0000 @@ -413,6 +413,9 @@ assert(delstr != NULL); return _PyInstance_Lookup(op, delstr) != NULL; } + else if (PyWeakref_Check(op) && + ((PyWeakReference *)op)->wr_callback != NULL) + return 1; else if (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE)) return op->ob_type->tp_del != NULL; else @@ -566,10 +569,6 @@ * to imagine how calling it later could create a problem for us. wr * is moved to wrcb_to_call in this case. */ - if (IS_TENTATIVELY_UNREACHABLE(wr)) - continue; - assert(IS_REACHABLE(wr)); - /* Create a new reference so that wr can't go away * before we can process it again. */
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4