At pycon, I have been looking into Python startup time. I found that CVS-Python allocates roughly 12,000 string objects on startup, whereas Python 2.2 only allocates 8,000 string objects. In either case, most strings come from unmarshalling string objects, and the increase is (probably) due to the increased number of modules loaded at startup (up from 26 to 34). The string objects allocated during unmarshalling are often quickly discarded after being allocated, as they are identifiers, and get interned - so only the interned version of the string survives, and the second copy is deallocated. I'd like to change the marshal format to perform sharing of equal strings, instead of marshalling the same identifiers multiple times. To do so, a dictionary of strings is created on marshalling and a list is created on unmarshalling, and a new marshal code for string-backreference would be added. What do you think? Regards, Martin
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4