[Guido]
> This suggests that zip() can be sped up considerably by using a hint
> for the input sequence sizes. If they all respond properly to
> PySequence_Length(), zip can preallocate the result list to the
> correct size. It shouldn't believe the numbers fully, but using them
> as a hint for initial allocation makes a lot of sense, and should
> greatly reduce the inefficiency.

I'm afraid I misanalyzed the cause. Giving zip a good guess about the
result size and using PyList_SET_ITEM up until then leads (as expected)
to no list copying in the test case I posted, but only dropped the
runtime from 7.87 to 7.31 seconds (which makes sense looking at the
numbers more closely: it doesn't take *that* long to copy a few dozen
megabytes of memory unless you're doing it by hand <wink>). A rough
pure-Python sketch of that hinted strategy is appended at the end of
this message.

Check this out:

"""
from time import clock as now

n = 1000000

def juststore(x):
    # store a plain int at every index
    for i in x:
        x[i] = 0

def justtups(x):
    # build a 1-tuple each time, then throw it away immediately
    for i in x:
        0,

def storetups(x):
    # build a 1-tuple each time and keep it alive by storing it
    for i in x:
        x[i] = 0,

def run(func):
    x = range(n)
    start = now()
    func(x)
    finish = now()
    print "%10s %6.2f" % (func.__name__, finish - start)

run(juststore)
run(justtups)
run(storetups)
"""

I get:

 juststore   0.93
  justtups   0.58
 storetups   7.61

list.append is out of the picture here. Creating a million tuples goes
fast so long as they're recycled, and storing a million things goes
fast, but storing a million distinct tuples takes very much longer.
It's a Mystery to me so far. How does it act on Linux?
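For reference, here is a rough pure-Python sketch of the hinted
preallocation being discussed; the name zip_with_hint and the fallback
details are illustrative only, and the real change would live in
zip()'s C implementation, using PySequence_Length() for the guess and
PyList_SET_ITEM to fill the preallocated slots:

"""
def zip_with_hint(*seqs):
    # Guess the result size from the shortest reported length; treat it
    # as a hint, not a guarantee.
    hint = min([len(s) for s in seqs])
    result = [None] * hint          # preallocate the result list
    i = 0
    while 1:
        try:
            item = tuple([s[i] for s in seqs])
        except IndexError:
            break
        if i < hint:
            result[i] = item        # fill a preallocated slot; no resizing
        else:
            result.append(item)     # a length was too small; fall back to append
        i = i + 1
    del result[i:]                  # a length was too large; trim the excess
    return result
"""

Used like the builtin: zip_with_hint(range(3), "abcdef") gives
[(0, 'a'), (1, 'b'), (2, 'c')].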