> > Guess which of these two statements is faster and by how much: > > > > list(itertools.repeat(None, 5000)) > > list(tuple(itertools.repeat(None, 5000))) > > I did guess, and was correct to 6 significant decimal digits <wink>. > > > The answer is important because it points the way to faster list > > comprehensions and other list building applications. > > That Python's builtin list calls realloc() on every append is pretty > gross, but it's also a tradeoff, saving the bytes in the list header > that would be required to remember how many allocated-but-unused > slots exist. If you've got a million tiny lists (and some apps do), > burning an additional 8MB for something your app doesn't really need > isn't attractive (on a 32-bit box today, the list header is 16 > bytes; pymalloc aligns to 8-byte boundaries, so the next possible > size is 24 bytes). list's very small overallocation (in the > ballpark of 5%) is also aimed at minimizing memory burden. Speed > isn't everything, tradeoffs are hard, and something loses when > something else is favored. (BTW, I never would have considered > implementing a resizable list type *without* keeping a field to > record the amount of overallocation, so this isn't the tradeoff I > would have made <wink>.) Just in case nobody had thought of it before, couldn't the realloc() call be avoided when roundupsize(oldsize) == roundupsize(newsize)??? > > ... > > And my interest is in a collections module which should probably > > published elsewhere (too much negativity on the subject here). > > A collections *package* is a fine idea. I say "package" instead of > "module" because there are literally hundreds of container types, > and queues and bags are a tiny corner of that space. An example > dear to my heart is BTrees (I maintain Zope's BTree implementation), > which is a wonderful way to get an associative mapping sorted by key > value. BTrees have a large API all by themselves, so it wouldn't > make sense to try to squash them into a module along with 50 other > container types. Right. +1 for a package. --Guido van Rossum (home page: http://www.python.org/~guido/)
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4