[Jack Jansen] > ... A long time ago, Dianne Hackborn actually implemented a scheme like this, under the name VREF (for "virtual reference", or some such). IIRC, differences from your scheme were mainly that: 1) There was an elaborate proxy mechanism to avoid having to explicitly strengthen the weak. 2) Each object contained a pointer to a linked list of associated weak refs. This predates DejaNews, so may be a pain to find. > ... > We add a new builtin function (or a module with that function) > weak(). This returns a weak reference to the object passed as a > parameter. A weak object has one method: strong(), which returns the > corresponding real object or raises an exception if the object doesn't > exist anymore. This interface appears nearly isomorphic to MIT Scheme's "hash" and "unhash" functions, except that their hash returns an (unbounded) int and guarantees that hash(o1) != hash(o2) for any distinct objects o1 and o2 (this is a stronger guarantee than Python's "id", which may return the same int for objects with disjoint lifetimes; the other reason object address isn't appropriate for them is that objects can be moved by garbage collection, but hash is an object invariant). Of course unhash(hash(o)) is o, unless o has been gc'ed; then unhash raises an exception. By most accounts (I haven't used it seriously myself), it's a usable interface. > ... > to implement this I need to add a pointer to every object. That's unattractive, of course. > ... > (actually: we could make do with a single bit in every object, with > the bit meaning "this object has an associated weak object". We could > then use a global dictionary indexed by object address to find the > weak object) Is a single bit actually smaller than a pointer? For example, on most machines these days #define PyObject_HEAD \ int ob_refcnt; \ struct _typeobject *ob_type; is two 4-byte fields packed solid already, and structure padding prevents adding anything less than a 4-byte increment in reality. I guess on Alpha there's a 4-byte hole here, but I don't want weak pointers enough to switch machines <wink>. OTOH, sooner or later Guido is going to want a mark bit too, so the other way to view this is that 32 new flag bits are as cheap as one <wink>. There's one other thing I like about this: it can get rid of the dicey > Strong() checks that self->object->weak == self and returns > self->object (INCREFfed) if it is. check. If object has gone away, you're worried that self->object may (on some systems) point to a newly-invalid address. But worse than that, its memory may get reused, and then self->object may point into the *middle* of some other object where the bit pattern at the "weak" offset just happens to equal self. Let's try a sketch in pseduo-Python, where __xxx are secret functions that do the obvious things (and glossing over thread safety since these are presumably really implemented in C): # invariant: __is_weak_bit_set(obj) == id2weak.has_key(id(obj)) # So "the weak bit" is simply an optimization, sparing most objects # from a dict lookup when they die. # The invariant is delicate in the presence of threads. id2weak = {} class _Weak: def __init__(self, obj): self.id = id(obj) # obj's refcount not bumped __set_weak_bit(obj) id2weak[self.id] = self # note that "the system" (see below) sets self.id # to None if obj dies def strong(self): if self.id is None: raise DeadManWalkingError(self.id) return __id2obj(self.id) # will bump obj's refcount def __del__(self): # this is purely an optimization: if self gets nuked, # exempt its referent from greater expense when *it* # dies if self.id is not None: __clear_weak_bit(__id2obj(self.id)) del id2weak[self.id] def weak(obj): return id2weak.get(id(obj), None) or _Weak(obj) and then whenever an object of any kind is deleted the system does: if __is_weak_bit_set(obj): objid = id(obj) id2weak[objid].id = None del id2weak[objid] In my current over-tired state, I think that's safe (modulo threads), portable and reasonably fast; I do think the extra bit costs 4 bytes, though. > ... > The weak object isn't transparent, because you have to call strong() > before you can do anything with it, but this is an advantage (says he, > aspiring to a career in politics or sales:-): with a transparent weak > object the object could disappear at unexpected moments and with this > scheme it can't, because when you have the object itself in hand you > have a refcount too. Explicit is better than implicit for me. [M.-A. Lemburg] > Have you checked the weak reference dictionary implementation > by Dieter Maurer ? It's at: > > http://www.handshake.de/~dieter/weakdict.html A project where I work is using it; it blows up a lot <wink/frown>. While some form of weak dict is what most people want in the end, I'm not sure Dieter's decision to support weak dicts with only weak values (not weak keys) is sufficient. For example, the aforementioned project wants to associate various computed long strings with certain hashable objects, and for some reason or other (ain't my project ...) these objects can't be changed. So they can't store the strings in the objects. So they'd like to map the objects to the strings via assorted dicts. But using the object as a dict key keeps it (and, via the dicts, also its associated strings) artificially alive; they really want a weakdict with weak *keys*. I'm not sure I know of a clear & fast way to implement a weakdict building only on the weak() function. Jack? Using weak objects as values (or keys) with an ordinary dict can prevent their referents from being kept artificially alive, but that doesn't get the dict itself cleaned up by magic. Perhaps "the system" should notify a weak object when its referent goes away; that would at least give the WO a chance to purge itself from structures it knows it's in ... > ... > BTW, how would this be done in JPython ? I guess it doesn't > make much sense there because cycles are no problem for the > Java VM GC. Weak refs have many uses beyond avoiding cycles, and Java 1.2 has all of "hard", "soft", "weak", and "phantom" references. See java.lang.ref for details. I stopped paying attention to Java, so it's up to you to tell us what you learn about it <wink>.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4