Warning: long message. If you're not interested in reading all this, please skip to "Conclusion" at the end. At Tim's recommendation I had a look at what section 12.6 of the Java language spec says about finalizers. The stuff there is sure seductive for language designers... Have a look at te diagram at http://java.sun.com/docs/books/jls/html/12.doc.html#48746. In all its (seeming) complexity, it helped me understand some of the issues of finalization better. Rather than the complex 8-state state machine that it appears to be, think of it as a simple 3x3 table. The three rows represent the categories reachable, finalizer-reachable (abbreviated in the diagram as f-reachable), and unreachable. These categories correspond directly to categories of objects that the Schemenauer-Tiedemann cycle-reclamation scheme deals with: after moving all the reachable objects to the second list (first the roots and then the objects reachable from the roots), the first list is left with the unreachable and finalizer-reachable objects. If we want to distinguish between unreachable and finalizer-reachable at this point, a straightforward application of the same algorithm will work well: Create a third list (this will contain the finalizer-reachable objects). Start by filling it with all the objects from the first list (which contains the potential garbage at this point) that have a finalizer. We can look for objects that have __del__ or __clean__ or for which tp_clean(CARE_EXEC)==true, it doesn't matter here.(*) Then walk through the third list, following each object's references, and move all referenced objects that are still in the first list to the third list. Now, we have: List 1: truly unreachable objects. These have no finalizers and can be discarded right away. List 2: truly reachable objects. (Roots and objects reachable from roots.) Leave them alone. List 3: finalizer-reachable objects. This contains objects that are unreachable but have a finalizer, and objects that are only reachable through those. We now have to decide on a policy for invoking finalizers. Java suggests the following: Remember the "roots" of the third list -- the nodes that were moved there directly from the first list because they have a finalizer. These objects are marked *finalizable* (a category corresponding to the second *column* of the Java diagram). The Java spec allows the Java garbage collector to call all of these finalizers in any order -- even simultaneously in separate threads. Java never allows an object to go back from the finalizable to the unfinalized state (there are no arrows pointing left in the diagram). The first finalizer that is called could make its object reachable again (up arrow), thereby possibly making other finalizable objects reachable too. But this does not cancel their scheduled finalization! The conclusion is that Java can sometimes call finalization on unreachable objects -- but only if those objects have gone through a phase in their life where they were unreachable or at least finalizer-unreachable. I agree that this is the best that Java can do: if there are cycles containing multiple objects with finalizers, there is no way (short of asking the programmer(s)) to decide which object to finalize first. We could pick one at random, run its finalizer, and start garbage collection all over -- if the finalizer doesn't resurrect anything, this will give us the same set of unreachable objects, from which we could pick the next finalizable object, and so on. That looks very inefficient, might not terminate (the same object could repeatedly show up as the candidate for finalization), and it's still arbitrary: the programmer(s) still can't predict which finalizer in a cycle with multiple finalizers will be called first. Assuming the recommended characteristics of finalizers (brief and robust), it won't make much difference if we call all finalizers (of the now-finalizeable objects) "without looking back". Sure, some objects may find themselves in a perfectly reachable position with their finalizer called -- but they did go through a "near-death experience". I don't find this objectionable, and I don't see how Java could possibly do better for cycles with multiple finalizers. Now let's look again at the rule that an object's finalizer will be called at most once automatically by the garbage collector. The transitions between the colums of the Java diagram enforce this: the columns are labeled from left to right with unfinalized, finalizable, and finalized, and there are no transition arrows pointing left. (In my description above, already finalized objects are considered not to have a finalizer.) I think this rule makes a lot of sense given Java's multi-threaded garbage collection: the invocation of finalizers could run concurreltly with another garbage collection, and we don't want this to find some of the same finalizable objects and call their finalizers again! We could mark them with a "finalization in progress" flag only while their finalizer is running, but in a cycle with multiple finalizers it seems we should keep this flag set until *all* finalizers for objects in the cycle have run. But we don't actually know exactly what the cycles are: all we know is "these objects are involved in trash cycles". More detailed knowledge would require yet another sweep, plus a more hairy two-dimensional data structure (a list of separate cycles). And for what? as soon as we run finalizers from two separate cycles, those cycles could be merged again (e.g. the first finalizer could resurrect its cycle, and the second one could link to it). Now we have a pool of objects that are marked "finalization in progress" until all their finalizations terminate. For an incremental concurrent garbage collector, this seems a pain, since it may continue to find new finalizable objects and add them to the pile. Java takes the logical conclusion: the "finalization in progress" flag is never cleared -- and renamed to "finalized". Conclusion ---------- Are the Java rules complex? Yes. Are there better rules possible? I'm not so sure, given the requirement of allowing concurrent incremental garbage collection algorithms that haven't even been invented yet. (Plus the implied requirement that finalizers in trash cycles should be invoked.) Are the Java rules difficult for the user? Only for users who think they can trick finalizers into doing things for them that they were not designed to do. I would think the following guidelines should do nicely for the rest of us: 1. Avoid finalizers if you can; use them only to release *external* (e.g. OS) resources. 2. Write your finalizer as robust as you can, with as little use of other objects as you can. 3. Your only get one chance. Use it. Unlike Scheme guardians or the proposed __cleanup__ mechanism, you don't have to know whether your object is involved in a cycle -- your finalizer will still be called. I am reconsidering to use the __del__ method as the finalizer. As a compromise to those who want their __del__ to run whenever the reference count reaches zero, the finalized flag can be cleared explicitly. I am considering to use the following implementation: after retrieving the __del__ method, but before calling it, self.__del__ is set to None (better, self.__dict__['__del__'] = None, to avoid confusing __setattr__ hooks). The object call remove self.__del__ to clear the finalized flag. I think I'll use the same mechanism to prevent __del__ from being called upon a failed initialization. Final note: the semantics "__del__ is called whenever the reference count reaches zero" cannot be defended in the light of a migration to different forms of garbage collection (e.g. JPython). There may not be a reference count. --Guido van Rossum (home page: http://www.python.org/~guido/) ____ (*) Footnote: there's one complication: to ask a Python class instance if it has a finalizer, we have to use PyObject_Getattr(obj, ...). If the object's class has a __getattr__ hook, this can invoke arbitrary Python code -- even if the answer to the question is "no"! This can make the object reachable again (in the Java diagram, arrows pointing up or up and right). We could either use instance_getattr1(), which avoids the __getattr__ hook, or mark all class instances as finalizable until proven innocent.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4