On 15.09.16 19:13, Antoine Pitrou wrote: > Since this micro-benchmark creates the keys in order just before > filling the dict with them, randomizing the insertion order destroys > the temporal locality of object header accesses when iterating over the > dict keys. *This* looks like the right explanation, not branch > mispredicts due to NULL checks. > > This also shows that a micro-benchmark that merely looks ok can actually > be a terrible proxy of actual performance. Thanks you for great explanation Antoine! I came to the same conclusions about randomized integers example, but didn't notice that this also is a main cause of the speed up of strings example. > As a further validation of this theory, let's dramatically decrease the > working set size on the initial benchmark: > > $ ./python -m timeit -s "d=dict.fromkeys(map(str,range(10**3)))" > "list(d)" > > -> Python 3.5: 100000 loops, best of 3: 10.9 usec per loop > -> Python 3.6: 100000 loops, best of 3: 9.72 usec per loop > > When the working set fits in the cache, this micro-benchmark is > only 12% slower on 3.5 compared to 3.6. > *This* much smaller difference (a mere 1.2ns difference per dict > element) could be attributed to eliminating the NULL checks, or to any > other streamlining of the core iteration logic. Yet one example, with random hashes and insertion order independent from the creation order. $ ./python -m timeit -s "import random; a = list(map(str, range(10**6))); random.shuffle(a); d = dict.fromkeys(a)" -- "list(d)" Python 3.5: 180, 180, 180 msec per loop Python 3.6: 171, 172, 171 msec per loop Python 3.6 is 5% faster and this looks closer to the actual performance.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4