We now have two implementations of Eric Tiedemann's idea: Neil and I both implemented it. It's too soon to post the patch sets (both are pretty rough) but I've got another design question.

Once we've identified a bunch of objects that are only referring to each other (i.e., one or more cycles) we have to dispose of them. The question is, how?

We can't just call free on each of the objects; some may not be allocated with malloc, and some may contain pointers to other malloc'ed memory that also needs to be freed. So we have to get their destructors involved. But how? Calling ob->ob_type->tp_dealloc(ob) for an object whose reference count is nonzero is unsafe -- this will destroy the object while there are still references to it! Those references are all coming from other objects that are part of the same cycle; those objects will also be deallocated and they will reference the deallocated objects (if only to DECREF them).

Neil uses the same solution that I use when finalizing the Python interpreter -- find the dictionaries and call PyDict_Clear() on them. (In his unpublished patch, he also clears the lists using PyList_SetSlice(list, 0, list->ob_size, NULL). He's also generalized it so that *every* object can define a tp_clear function in its type object.)

As long as every cycle contains at least one dictionary or list object, this will break cycles reliably and get rid of all the garbage. (If you wonder why: clearing the dict DECREFs the next object(s) in the cycle; if the last dict referencing a particular object is cleared, the last DECREF will deallocate that object, which will in turn DECREF the objects it references, and so forth. Since none of the objects in the cycle has incoming references from outside the cycle, we can prove that this will delete all objects as long as there's a dict or list in each cycle.)

However, there's a snag.
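
Before getting to that snag, the cycle-breaking argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from either patch: it relies on CPython's reference counting, it switches off the modern cycle collector so that nothing else interferes, and the names Node, build_cycle and break_cycle_demo are hypothetical. The cycle is node_a -> dict -> node_b -> node_a, and the extra reference to the dict stands in for the one the collector would hold while it works.

```python
import gc
import weakref


class Node:
    """Hypothetical object that holds exactly one reference, in a slot."""
    __slots__ = ("link", "__weakref__")


def build_cycle():
    """Build node_a -> d -> node_b -> node_a; return the dict and weakrefs."""
    d = {}
    node_a, node_b = Node(), Node()
    node_a.link = d          # node_a refers to the dict
    d["next"] = node_b       # the dict refers to node_b
    node_b.link = node_a     # node_b refers back to node_a
    # Weak references let us observe deallocation without adding references.
    return d, [weakref.ref(node_a), weakref.ref(node_b)]


def break_cycle_demo():
    """Clear the one dict in the cycle and report liveness before and after."""
    was_enabled = gc.isenabled()
    gc.disable()             # assumption: only reference counting is at work
    try:
        d, refs = build_cycle()
        alive_before = [r() is not None for r in refs]
        # Clearing the dict DECREFs node_b to zero; its deallocation DECREFs
        # node_a to zero, which in turn drops node_a's reference to the dict.
        d.clear()
        alive_after = [r() is not None for r in refs]
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(break_cycle_demo())
```

Both nodes are alive before the clear and gone after it, even though neither was ever deallocated directly: the single dict in the cycle was enough to unravel the rest.
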
It's the same snag as what finalizing the Python interpreter runs into -- it has to do with __del__ methods and the undefined order in which the dictionaries are cleared.

For example, it's quite possible that the first dictionary we clear is the __dict__ of an instance, so this zaps all its instance variables. Suppose this breaks the cycle, so then the instance itself gets DECREFed to zero. Its deallocator will be called. If it's got a __del__, this __del__ will be called -- but all the instance variables have already been zapped, so it will fail miserably!

It's also possible that the __dict__ of a class involved in a cycle gets cleared first, in which case the __del__ no longer "exists", and again the cleanup is skipped.

So the question is: What to *do*?

My solution is to make an extra pass over all the garbage objects *before* we clear dicts and lists, and for those that are instances and have __del__ methods, call their __del__ ("by magic", as Tim calls it in another post). The code in instance_dealloc() already does the right thing here: it calls __del__, then discovers that the reference count is > 0 ("I'm not dead yet" :-), and returns without freeing the object. (This is also why I want to introduce a flag ensuring that __del__ gets called by instance_dealloc at most once: later when the instance gets DECREFed to 0, instance_dealloc is called again and will correctly free the object; but we don't want __del__ called again.) [Note for Neil: somehow I forgot to add this logic to the code; in_del_called isn't used! The change is obvious though.]

This still leaves a problem for the user: if two class instances reference each other and both have a __del__, we can't predict whose __del__ is called first when they are called as part of cycle collection. The solution is to write each __del__ so that it doesn't depend on the other __del__.

Someone (Tim?) in the past suggested a different solution (probably found in another language): for objects that are collected as part of a cycle, the destructor isn't called at all. The memory is freed (since it's no longer reachable), but the destructor is not called -- it is as if the object lives on forever.

This is theoretically superior, but not practical: when I have an object that creates a temp file, I want to be able to reliably delete the temp file in my destructor, even when I'm part of a cycle!

--Guido van Rossum (home page: http://www.python.org/~guido/)
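
The two-pass scheme described above (call each __del__ while the instance variables are still intact, then clear the containers, and never call __del__ a second time) can be sketched as a small pure-Python simulation. This is an illustration under stated assumptions, not the C code from either patch: Resource, finalize, run_finalizer_once and collect_cycle are hypothetical names, finalize stands in for __del__, and the del_called slot plays the role of the at-most-once flag. The flag is kept in a slot rather than in the instance __dict__ so that clearing the __dict__ does not reset it.

```python
class Resource:
    """Hypothetical instance type; finalize() stands in for __del__."""
    __slots__ = ("del_called", "__dict__")

    def __init__(self, name, log):
        self.del_called = False   # at-most-once flag, survives dict clearing
        self.name = name          # ordinary instance variables in __dict__
        self.log = log
        self.peer = None

    def finalize(self):
        # Needs the instance variables, so it must run before they are zapped.
        self.log.append("finalize " + self.name)


def run_finalizer_once(obj):
    """Run obj.finalize() unless it already ran; return True if it ran now."""
    if obj.del_called:
        return False
    obj.del_called = True
    obj.finalize()
    return True


def collect_cycle(garbage):
    """Simulate collecting a list of objects that only refer to each other."""
    # Pass 1: run finalizers while every instance is still fully populated.
    for obj in garbage:
        run_finalizer_once(obj)
    # Pass 2: break the cycles by clearing the instance dictionaries.
    for obj in garbage:
        vars(obj).clear()
    # Simulated final deallocation: the finalizer must not run a second time.
    return [run_finalizer_once(obj) for obj in garbage]


if __name__ == "__main__":
    log = []
    a, b = Resource("a", log), Resource("b", log)
    a.peer, b.peer = b, a         # two instances referencing each other
    print(collect_cycle([a, b]), log)
```

In this sketch the order of the two finalize calls is simply the order of the garbage list, which matches the caveat above: neither finalizer should depend on the other having run, or not run, first.
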