Quoting Victor Stinner <victor.stinner at gmail.com>:

> Slowly, I'm trying to see if it would be possible to reduce the memory
> footprint of Python using the tracemalloc module.
[...]
> Should I open a separate issue for each idea to track them in the bug
> tracker, or a global issue?

There is a third alternative which I would recommend: do not open
tracker issues at all - unless you can also offer a patch. The things
you find are not bugs per se, not even "issues". It is fine and
applaudable that you look into this, but other people may have other
priorities (like reimplementing the hash function of string objects).

So if you remember that there is a potential for optimization, that may
be enough for the moment. Or share it on python-dev (as you do below);
people may be intrigued to look into this further, or ignore it. It's
easy to ignore a posting to python-dev, but more difficult to ignore an
issue on the tracker (*something* should be done about it, e.g. close
with no action).

> First, I noticed that linecache can allocate more than 2 MB. What do
> you think of adding a registry of "clear cache" functions? For
> example, re.purge() and linecache.clearcache(). gc.collect() clears
> free lists. I don't know if gc.collect() should be related to this new
> registry (clear all caches) or not.

I'm -1 on this idea. There are some "canonical" events that could
trigger clearance of caches, namely
- out-of-memory situations
- OS signals indicating memory pressure
While these sound interesting in theory, they fail in practice. For
example, they are very difficult to test.

> The dictionary of interned Unicode strings can be large: up to 1.5 MB
> (with +30,000 strings). Just the dictionary, excluding size of
> strings. Is the size normal or not? Using tracemalloc, this dictionary
> is usually the largest memory block.

I'd check the contents of the dictionary. How many strings are in
there; how many of these are identifiers; how many have more than one
outside reference; how many are immortal?

If there are a lot of strings that are not identifiers, some code
possibly abuses interning, and should use its own dictionary instead.
For the refcount-1 mortal identifiers, I'd trace back where they are
stored, and check if many of them originate from the same module.

Regards,
Martin
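
As an illustration of the linecache measurement discussed above, here is a minimal sketch of how one might check how much memory tracemalloc attributes to linecache before and after the existing cache-clearing calls. The helper names `linecache_bytes` and `measure_linecache_clear`, the throwaway file, and the default of 10000 lines are assumptions made for this example and do not come from the thread. It also assumes tracemalloc's default of one recorded frame per allocation, so that lines read on behalf of linecache are attributed to linecache.py.

```python
import gc
import linecache
import os
import re
import tempfile
import tracemalloc


def linecache_bytes(snapshot):
    """Return the traced bytes whose allocation site is linecache.py."""
    filtered = snapshot.filter_traces(
        [tracemalloc.Filter(True, linecache.__file__)]
    )
    return sum(stat.size for stat in filtered.statistics("filename"))


def measure_linecache_clear(line_count=10000):
    """Fill linecache from a throwaway file, then clear the caches.

    Returns (lines_cached, bytes_before, bytes_after).
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical sample file, only there to give linecache work to do.
        path = os.path.join(tmp, "sample.py")
        with open(path, "w", encoding="utf-8") as f:
            f.writelines(f"value_{i} = {i}\n" for i in range(line_count))

        tracemalloc.start()
        try:
            # Reading the file through linecache populates its cache.
            lines_cached = len(linecache.getlines(path))
            before = linecache_bytes(tracemalloc.take_snapshot())

            # The existing "clear cache" entry points named in the thread.
            linecache.clearcache()
            re.purge()
            gc.collect()
            after = linecache_bytes(tracemalloc.take_snapshot())
        finally:
            tracemalloc.stop()
    return lines_cached, before, after


if __name__ == "__main__":
    cached, before, after = measure_linecache_clear()
    print(f"{cached} lines cached; {before} bytes before, {after} bytes after")
```

The exact byte counts depend on the Python version and platform, so the sketch reports them rather than assuming a particular figure.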