[Bruce Christensen]
> We seem to have stumbled upon some strange behavior in cPickle's memo
> use when pickling instances.
>
> Here's the repro:
>
> [mymodule.py]
> class C:
>     def __getstate__(self): return ('s1', 's2', 's3')
>
> [interactive interpreter]
> Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on
> win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import cPickle
> >>> import mymodule
> >>> class C:
> ...     def __getstate__(self): return ('s1', 's2', 's3')
> ...
> >>> for x in mymodule.C(), C(): cPickle.dumps(x)
> ...
> "(imymodule\nC\np1\n(S's1'\nS's2'\np2\nS's3'\np3\ntp4\nb."
> "(i__main__\nC\np1\n(S's1'\nS's2'\nS's3'\ntp2\nb."
> >>>
>
> Note that the second and third strings in the instance's state are
> memoized in the first case, but not in the second. Any idea why this
> occurs (and why the first element is never memoized)?

Ideally, a pickle would never contain a `PUT i` unless i was referenced by
a `GET i` later. So, ideally, there would be no PUT opcodes in either of
these pickles.

cPickle is a little bit smarter than pickle.py here, in that cPickle
suppresses a PUT if the reference count on the object is less than 2 (in
which case the structure being pickled can't possibly reference the
sub-object a second time, so it's impossible that a later GET will want to
reference the same sub-object). So all you're seeing here is refcount
accidents, complicated by accidents concerning exactly which strings get
interned.

Use pickle.py instead (which doesn't do this refcount micro-optimization),
and you'll see the same number of PUTs in both. They're all correct. What
would be incorrect is seeing a `GET i` without a preceding `PUT i` using
the same `i`.
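
To make the PUT/GET rule concrete, here is a minimal sketch that inspects a pickle's memo opcodes. It assumes a modern Python 3 interpreter with the standard pickletools module rather than the Python 2.4 cPickle from the report, and the helper names (memo_ops, gets_are_backed_by_puts) are made up for illustration. It checks that every `GET i` follows a `PUT i`, then uses pickletools.optimize to drop the PUTs that no GET ever references.

```python
import pickle
import pickletools


def memo_ops(payload):
    """Return (opcode name, memo index) pairs for PUT/GET, in stream order."""
    return [(op.name, arg)
            for op, arg, _pos in pickletools.genops(payload)
            if op.name in ("PUT", "GET")]


def gets_are_backed_by_puts(payload):
    """True if every `GET i` is preceded by a `PUT i` with the same i."""
    seen = set()
    for name, index in memo_ops(payload):
        if name == "PUT":
            seen.add(index)
        elif index not in seen:
            return False
    return True


# One list is referenced twice, so the pickle needs one real PUT/GET pair.
shared = ["s1"]
data = [shared, shared, ["s2"]]

# Protocol 0 is the text protocol used in the report, with plain PUT/GET.
raw = pickle.dumps(data, protocol=0)

# optimize() returns an equivalent pickle with the unused PUTs removed.
slim = pickletools.optimize(raw)
restored = pickle.loads(slim)

print(memo_ops(raw))
print(memo_ops(slim))
```

Which PUTs appear in the unoptimized pickle depends on the pickler implementation, as described above, so the first printed list may differ between interpreters. The invariant that matters is the one checked by gets_are_backed_by_puts.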