Antoine Pitrou wrote:
> Michael Haggerty <mhagger <at> alum.mit.edu> writes:
>> It is easy to optimize the pickling of instances by giving them
>> __getstate__() and __setstate__() methods. But the pickler still
>> records the type of each object (essentially, the name of its class) in
>> each record. The space for these strings constituted a large fraction
>> of the database size.
>
> If these strings are not interned, then perhaps they should be.
> There is a similar optimization proposal (w/ patch) for attribute names:
> http://bugs.python.org/issue5084

If I understand correctly, this would not help:

- on writing, the strings are identical anyway, because they are read
  out of the class's __name__ and __module__ fields. Therefore the
  Pickler's usual memoizing behavior will prevent the strings from
  being written more than once.

- on reading, the strings are only used to look up the class. Therefore
  they are garbage collected almost immediately.

This is a different situation than that of attribute names, which are
stored persistently as the keys in the instance's __dict__.

Michael
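A rough sketch of the writing-side point, not code from this thread: it uses fractions.Fraction from the standard library as a stand-in for a user-defined class, and the variable names are made up for illustration. It counts how often the class name appears when 100 instances go through one Pickler, versus when each instance is pickled as its own record. Protocol 2 is an arbitrary choice here.

```python
import pickle
from fractions import Fraction

# Stand-in for a user-defined class; any picklable class would do.
CLASS_NAME = b"Fraction"
records = [Fraction(i, 7) for i in range(1, 101)]

# One Pickler for all records: the class reference is written once,
# and later instances refer back to it through the memo.
single_stream = pickle.dumps(records, protocol=2)
names_in_single_stream = single_stream.count(CLASS_NAME)

# One pickle per record (as in a database of separately pickled
# records): each pickle starts with an empty memo, so each one
# carries the class name again.
separate_streams = [pickle.dumps(r, protocol=2) for r in records]
names_in_separate_streams = sum(s.count(CLASS_NAME) for s in separate_streams)

print(names_in_single_stream, names_in_separate_streams)
```

If this reading is right, interning would change neither count: the repetition in the per-record case comes from each pickle having its own memo, not from the strings being distinct objects.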