[Jason Evans] > Since the spring of 2003, I have been developing Crux, which is a computer > program for phylogenetic inferencing (bioinformatics) research. In March > of 2004, I switched Crux to using Python from having used a different > embeddable interpreter. For the most part, I have been very happy with > Python, but Python's garbage collector has been a major source of > frustration. Then you're not using it correctly -- it's easy if you are. You didn't show actual code, though, and I don't have time to guess what you might have done. > ... > The important aspect of Crux is that it deals with trees. The details shouldn't matter; Python's cyclic gc handles arbitrary graph structures, including multiple self-loops if you want them. The Python core creates graph structures of boggling complexity, so it might help you to study how core objects implement their tp_traverse and tp_clear methods. They're mostly very simple. Where they're not very simple, it's usually because someone had a silly idea of "optimization" in mind. ... > 1) I originally had lone rings refer to themselves (ob_refcnt started out > at 3; 2 self-references and one reference held by the associated edge). > This didn't work. "Didn't work" leaves the reader guessing about everything. Showing code, and being specific about both what you expected and what you observed, are necessary. python-dev isn't really the right place for that, though. > It appears that the cyclic GC does not actually calculate the number of live > references to objects (references that are reachable by traversing all objects > accessible from the root set); Python's gc has no "root set" concept, so thinking in those terms won't help. It won't grow a root-set concept, either: "foreign" extension modules (like Tcl/Tk, or library packages written in Fortran) can hold what would normally be considered root-set objects for Python, and that's fine. They're not required to cooperate with Python's memory management beyond respecting Python's refcount rules. Forcing modules (via some new API) to identify root-set objects they own would be backward-incompatible (not to mention a major new burden on extension authors and users), so won't happen. > instead it assumes that if tp_clear() doesn't drop the reference count > to a number that equals the number of references from live objects, > there must still be references from live objects. Unfortunately, > visit()ing self does not work, so there is no way to convince Python > that all references are from unreachable objects. Again I can't guess what "does not work" means. But I can assure you it "would work" if I wrote it <wink>. It's easy: an object's tp_traverse must call the visit() callback exactly once for each non-NULL PyObject* *directly* (in one step) reachable from the object, and that's all. If self is reachable from self via a refcounted pointer held by self, then self's tp_traverse() implementation must call visit(self, arg). But I think you're getting off track even *caring* what your pointers point at. If some object type T has three Py_Object* pointers pa, pb, pc, then using Python 2.4's Py_VISIT macro, T's tp_traverse should be coded static int T_traverse(T *self, visitproc visit, void *arg) { Py_VISIT(self->pa); Py_VISIT(self->pb); Py_VISIT(self->pc); return 0; } It doesn't matter what pa (etc) points at. If pa points to self, fine; if it doesn't, also fine. > Working around this in Crux requires a lot of extra reference counting > complexity, because there are three different cases for reference > counts, depending on how many members there are in a ring (1, 2, or 3+ > members). Please show code. If, e.g., you have a vector v of length n holding the PyObject* "members", then int i; for (i = 0; i < self->n; ++i) Py_VISIT(self->v[i]); return 0; is an appropriate tp_traverse(). For example, that's exactly what the tp_traverse() for Python's builtin list objects does, and a Python list L can certainly be an element of itself: >>> L = [] >>> L.append(L) >>> L[0] is L True >>> There's no special-casing needed for the number of containees, or for the nature of what they do or don't point at. It's also irrelevant what can be reached *from* the direct containees. > 2) This issue is really just a different manifestation of issue (1). > > At the C (not Python object) level, each node only actually stores a > pointer to a single member of the associated ring. Please show code. If the C object contains exactly one pointer, then its tp_traverse should call visit() exactly once; etc. The correct C-level code has entirely to do with what the C struct contains, and nothing to do with the view presented at the Python level. > Given a single ring member, it is possible to traverse the ring and reach all > other ring members. That's irrelevant to gc. tp_traverse() is only responsible for revealing the objects *directly* reachable from self, or, more precisely, for revealing the PyObject* pointers on which self owns a reference. gc computes the transitive closure itself; tp_traverse() isn't responsible for that. > As mentioned in issue (1), the cyclic GC expects tp_traverse() > to call visit() once for each reference held. Yes. > It is not enough for a node to visit() one ring member; Don't know what it means for "a node to visit()". If the C struct has exactly one pointer, then it is enough for visit() to be invoked exactly once with that pointer. It would be incorrect not to visit() that pointer, and it would be incorrect to visit() more pointers than just that one. > it must visit() all ring members, If and only if for each ring member M, self contains a pointer directly to M. Showing code is much better than English. View your C-level objects as a directed graph. tp_traverse at a node `self` is reponsible for revealing self's outgoing edges, no less and no more than that. > in order for the GC to count how many references are from unreachable > objects, versus reachable from the root set. > > In summary, issues (1) and (2) are due to how the cyclic GC does the > "marking" phase of mark/sweep GC. It's not a mark/sweep collector. > My expectation of mark/sweep GC is You expectation is irrelevant <wink>, since it's not a mark/sweep collector. > that it should be sufficient to assure that all objects reachable from > the root set are visit()ed at least once; it should not be important how > many times each unreachable object is visit()ed. Forget roots sets and mark/sweep collectors. If you want to know how Python's gc actually works, read the comments in gcmodule.c's collect() function, and drill down from there as deep as your curiosity takes you. > I don't have a deep enough understanding of the Python interpreter to > give a detailed suggestion for improvement. I suspect that determining > the root set is not an easy operation; if this is the case, then I think > we're stuck with the current design. We are, but I don't feel "stuck" with it -- it's a very good design. > If determining the root set *is* easy (or even possible), then I would suggest > using a standard mark/sweep collector, where unreachable objects are > scheduled for destruction. The root set is impossible to determine without active cooperation from extension modules. Python's gc can infer the existence of roots, but not their identies; it can infer the transitive closure of what's reachable from the root set intersected with the set of objects that have registered with gc, but knows nothing about the roots themselves. > tp_traverse(), tp_clear(), and tp_dealloc() would retain > the same structure; the only difference would be the logic that > determines which objects can be destroyed. > > 3) A strange thing can happen when tp_clear() is called. Suppose that an > edge object is being cleared, and it drops its references to the > associated rings. If ob_refcnt of one of the rings drops to 0 as a > result, Python will tp_dealloc() the ring *right* *now*, without ever > calling tp_clear() on the ring. That depends on how the tp_dealloc() for the ring is coded, so is up to you. The only *necessary* thing for tp_clear() to do is to clear enough pointers to break cycles. Some people do only that much. For example, Python tuples don't even implement tp_clear, because a cycle involving tuples necessarily involves non-tuple objects too. Other people write tp_clear() so that it can be called from their tp_dealloc() function too. That's up to them. If you want to do the latter, it's good to write tp_clear() in such a way that calling it more than once is harmless; e.g., if (self->pa) { Py_DECREF(self->pa); self->pa = NULL; } etc or, using 2.4's Py_CLEAR macro, Py_CLEAR(self->pa); etc. > That means that I have to keep track of whether tp_clear() has been called > on objects, if it is at all important that tp_clear() be called, so that I can > manually do so in tp_dealloc(), As above, make your tp_clear() idempotent and then you don't have to keep track of anything. It's usually good practice for a tp_clear() to be robust against NULL pointers anyway. > if necessary. It is in my opinion reasonable to have cleanup code in > tp_clear(), with the assumption that it will be called precisely once, > but Python makes me do extra work to make sure that this happens. You won't get a guarantee that tp_clear() will be called exactly once. You have a guarantee that tp_dealloc() will be called no more than once. It's actually unusual to find a tp_dealloc() that calls its type's tp_clear(), but I think it's a nice pattern when you can get away with it. In such cases, tp_clear() should be coded so that it can be called multiple times, and that's generally very easy to do. > This should be pretty easy to change. A single bit per object is needed > to keep track of whether tp_clear() has been called. I think this only > needs to be done for object types that support cyclic GC. There's no such thing as "one bit" in reality: because of malloc memory padding, "one more bit" would require adding 4 bytes to every gc'able object. That's an extraordinary expense. Luckily, it's not needed. > 4) There is no way to control the order in which objects are > tp_dealloc()ed. It's true that there's no direct way to control the order. > This is a problem for the R-E-R construct, since at a low level, these objects > are always linked together. Many things are always linked together in the core too. Why that's "a problem" for you isn't clear. It's not a problem in the core, for example. > What I would like to do is defer tp_dealloc() on the edge until after both > rings have been deleted. Instead, I am forced to use a reference-counted > deletion function. Sorry, don't know what that means, and (as above) don't know why you care. > Not calling self->ob_type->tp_free() on the edge in tp_dealloc() until > later is not a reasonable option, because this defers deletion of the > edge until a later round of garbage collection. > > This could be addressed in the Python interpreter by paying heed to the > return value of tp_dealloc(). If the return value is non-zero, move the > object to the end of the list of objects to be destroyed, Except there is no list of objects to be destroyed. > so that destruction is tried later. This allows the module to impose its own > destruction ordering. That part would need clearer motivation, preferably in the form of a PEP.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4