On Tue, Jan 27, 2009 at 10:43 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Interning the strings on unpickling makes the pickles smaller, and at
>> least for cPickle actually makes unpickling sequences of many objects
>> slightly faster. I have included proposed patches to cPickle.c and
>> pickle.py, and would appreciate any feedback.
>
> Please submit patches always to the bug tracker.
>
> On the proposed change: While it is fairly unintrusive, I would like to
> propose a different approach - pickle interned strings special. The
> marshal module already uses this approach, and it should extend to
> pickle (although it would probably require a new protocol).
>
> On pickling, inspect each string and check whether it is interned. If
> so, emit a different code, and record it into the object id dictionary.
> On a second occurrence of the string, only pickle a backward reference.
> (Alternatively, check whether pickling the same string a second time
> would be more compact).
>
> On unpickling, support the new code to intern the result strings;
> subsequent references to it will go to the standard backreferencing
> algorithm.

Hm. This would change the pickling format though. Wouldn't just
interning (short) strings on unpickling be simpler?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
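
For illustration only, here is a rough sketch of what "interning (short) strings on unpickling" could look like from the outside. It is not the proposed patch to cPickle.c or pickle.py: it post-processes the result of pickle.loads() in modern Python 3 using sys.intern(), and it only walks plain lists, tuples and dicts. The names intern_short and loads_interned and the cutoff MAX_INTERN_LEN are hypothetical, chosen just for this example.

```python
import pickle
import sys

# Hypothetical cutoff for what counts as a "short" string; not from the thread.
MAX_INTERN_LEN = 20


def intern_short(obj, max_len=MAX_INTERN_LEN):
    """Return obj with short str values (and dict keys) replaced by interned copies.

    Only exact str, list, tuple and dict instances are handled; everything
    else is returned unchanged. sys.intern() rejects str subclasses, hence
    the exact type checks.
    """
    if type(obj) is str:
        return sys.intern(obj) if len(obj) <= max_len else obj
    if type(obj) is list:
        return [intern_short(x, max_len) for x in obj]
    if type(obj) is tuple:
        return tuple(intern_short(x, max_len) for x in obj)
    if type(obj) is dict:
        return {
            intern_short(k, max_len): intern_short(v, max_len)
            for k, v in obj.items()
        }
    return obj


def loads_interned(data, max_len=MAX_INTERN_LEN):
    """Unpickle data, then intern the short strings found in the result."""
    return intern_short(pickle.loads(data), max_len)


if __name__ == "__main__":
    # Build the strings at run time so they are not shared compile-time constants.
    first = pickle.dumps([{"".join(["na", "me"]): 1}])
    second = pickle.dumps([{"".join(["na", "me"]): 2}])
    key_a = next(iter(loads_interned(first)[0]))
    key_b = next(iter(loads_interned(second)[0]))
    print(key_a is key_b)  # equal short keys from separate pickles share one object
```

Unlike the special-opcode approach quoted above, this leaves the pickle format untouched, so existing pickles keep loading. The trade-off is that the pickles themselves do not get any smaller; only the unpickled objects share their short strings.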