Offline, Jeremy, Neil, Tim & I figured out what was really the case here. I believe the last thing we reported here was that we'd found a sample program that required several gc.collect() calls to clean out all its garbage, which surprised Neil and Tim. Since they know the collector inside out, if it surprises them, there's probably a bug. Here's the analysis. A new-style instance increfs its class (its ob_type), to make sure that the class doesn't go away as long as the instance exists. But the tp_traverse handler for new-style instances didn't visit the ob_type reference, so the collector thinks there are "outside" references to the class, and doesn't collect it. When the last instance goes away, the refcnt to the class finally matches what the collector sees, and it can collect the class (once all other references to it are gone of course). I was tempted to say "this ain't so bad, let's keep it that way." But Jeremy and Tim came up with counterexamples involving a cycle between a class and its instance, where the cycle would never be broken. Tim found that this program grows without bounds, though ever slower, since it spends more and more time in the 2nd generation collector, where all the uncollected objects eventually end up: while 1: class A(object): pass A.a = A() I tried the simplest possible fix, which was to visit self->ob_type in the new-style instance tp_traverse handler (subtype_traverse() in typeobject.c). But this caused assertions to fail all over the place. It turns out that when the collector decides to break a cycle like this, it calls the tp_clear handler for each object in the cycle, and then the subsequent deletion of the instance references the type in ways that have been made invalid by the clearing of the type. So this was a dead end. So here's a patch that does do the right thing. It adds a new field to type objects, tp_dependents. This is essentially a second reference count, counting the instances and direct subclasses. As long as tp_dependents is nonzero, this means there are still instances or subclasses, and then the type's tp_clear handler doesn't do anything. A consequence of the patch is that there will always be examples that take more than one collection to clean their garbage -- but eventually all garbage will be cleared out. (I suppose a worst-case example would be a very long chain of subclasses, which would be cleared out once class per collection.) Consequently, I'm patching test_gc.py to get rid of garbage left behind by previous tests in a loop. A downside of the patch is that it adds a new field to the type object structure; I believe this prevents it from being a backport candidate to 2.2.2. For a half blissful hour I believed that it would be possible to do something much simpler by doubling the regular refcnt; then I realized that the double refcnt would mean the collector would never break the cycle. Still, I am wishing for a solution that avoids adding a new field. Please comment on this patch! Index: Include/object.h =================================================================== RCS file: /cvsroot/python/python/dist/src/Include/object.h,v retrieving revision 2.101 diff -c -r2.101 object.h *** Include/object.h 12 Apr 2002 01:57:06 -0000 2.101 --- Include/object.h 8 Jun 2002 13:09:14 -0000 *************** *** 292,297 **** --- 292,298 ---- PyObject *tp_cache; PyObject *tp_subclasses; PyObject *tp_weaklist; + int tp_dependents; #ifdef COUNT_ALLOCS /* these must be last and never explicitly initialized */ Index: Objects/typeobject.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/typeobject.c,v retrieving revision 2.148 diff -c -r2.148 typeobject.c *** Objects/typeobject.c 4 Jun 2002 19:52:53 -0000 2.148 --- Objects/typeobject.c 8 Jun 2002 13:09:15 -0000 *************** *** 218,225 **** memset(obj, '\0', size); ! if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) Py_INCREF(type); if (type->tp_itemsize == 0) PyObject_INIT(obj, type); --- 218,227 ---- memset(obj, '\0', size); ! if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) { Py_INCREF(type); + type->tp_dependents++; + } if (type->tp_itemsize == 0) PyObject_INIT(obj, type); *************** *** 290,295 **** --- 292,303 ---- } } + if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) { + int err = visit((PyObject *)type, arg); + if (err) + return err; + } + if (basetraverse) return basetraverse(self, visit, arg); return 0; *************** *** 464,469 **** --- 472,478 ---- /* Can't reference self beyond this point */ if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) { Py_DECREF(type); + type->tp_dependents--; } } *************** *** 1170,1175 **** --- 1179,1192 ---- Py_INCREF(base); type->tp_base = base; + /* Incref the bases' tp_dependents count */ + for (i = 0; i < nbases; i++) { + PyTypeObject *b; + b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i); + if (PyType_Check(b) && (b->tp_flags & Py_TPFLAGS_HEAPTYPE)) + b->tp_dependents++; + } + /* Initialize tp_dict from passed-in dict */ type->tp_dict = dict = PyDict_Copy(dict); if (dict == NULL) { *************** *** 1431,1441 **** --- 1448,1475 ---- static void type_dealloc(PyTypeObject *type) { + PyObject *bases; etype *et; /* Assert this is a heap-allocated type object */ assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE); _PyObject_GC_UNTRACK(type); + + /* Decref the bases' tp_dependents count */ + bases = type->tp_bases; + if (bases) { + int i, nbases; + assert(PyTuple_Check(bases)); + nbases = PyTuple_GET_SIZE(bases); + for (i = 0; i < nbases; i++) { + PyTypeObject *b; + b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i); + if (PyType_Check(b) && + (b->tp_flags & Py_TPFLAGS_HEAPTYPE)) + b->tp_dependents--; + } + } + PyObject_ClearWeakRefs((PyObject *)type); et = (etype *)type; Py_XDECREF(type->tp_base); *************** *** 1495,1502 **** etype *et; int err; ! if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) ! return 0; et = (etype *)type; --- 1529,1535 ---- etype *et; int err; ! assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE); et = (etype *)type; *************** *** 1524,1533 **** type_clear(PyTypeObject *type) { etype *et; ! PyObject *tmp; ! if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) ! return 0; et = (etype *)type; --- 1557,1583 ---- type_clear(PyTypeObject *type) { etype *et; ! PyObject *tmp, *bases; ! assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE); ! ! if (type->tp_dependents) ! return 0; /* Not yet, there are still instances */ ! ! /* Decref the bases' tp_dependents count */ ! bases = type->tp_bases; ! if (bases) { ! int i, nbases; ! assert(PyTuple_Check(bases)); ! nbases = PyTuple_GET_SIZE(bases); ! for (i = 0; i < nbases; i++) { ! PyTypeObject *b; ! b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i); ! if (PyType_Check(b) && ! (b->tp_flags & Py_TPFLAGS_HEAPTYPE)) ! b->tp_dependents--; ! } ! } et = (etype *)type; *************** *** 1754,1763 **** --- 1804,1815 ---- } if (new->tp_flags & Py_TPFLAGS_HEAPTYPE) { Py_INCREF(new); + new->tp_dependents++; } self->ob_type = new; if (old->tp_flags & Py_TPFLAGS_HEAPTYPE) { Py_DECREF(old); + old->tp_dependents--; } return 0; } Index: Lib/test/test_gc.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/test/test_gc.py,v retrieving revision 1.14 diff -c -r1.14 test_gc.py *** Lib/test/test_gc.py 28 Mar 2002 21:22:25 -0000 1.14 --- Lib/test/test_gc.py 8 Jun 2002 13:09:15 -0000 *************** *** 220,228 **** def test(): if verbose: print "disabling automatic collection" enabled = gc.isenabled() gc.disable() ! verify(not gc.isenabled() ) debug = gc.get_debug() gc.set_debug(debug & ~gc.DEBUG_LEAK) # this test is supposed to leak --- 220,229 ---- def test(): if verbose: print "disabling automatic collection" + while gc.collect(): pass # collect garbage from previous tests enabled = gc.isenabled() gc.disable() ! verify(not gc.isenabled()) debug = gc.get_debug() gc.set_debug(debug & ~gc.DEBUG_LEAK) # this test is supposed to leak --Guido van Rossum (home page: http://www.python.org/~guido/)
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4