I've experimented with about a dozen ways to improve dictionary performance and found one that benefits some programs by up to 5% without hurting the performance of other programs by more than a single percentage point. It entails a one line change to dictobject.c resulting in a new schedule of dictionary sizes for a given number of entries: Number of Current size Proposed size Filled Entries of dictionary of dictionary -------------- ------------- ------------- [-- 0 to 5 --] 8 8 [-- 6 to 10 --] 16 32 [-- 11 to 21 --] 32 32 [-- 22 to 42 --] 64 128 [-- 43 to 85 --] 128 128 [-- 86 to 170 --] 256 512 [-- 171 to 341 --] 512 512 The idea is to lower the average sparseness of dictionaries (by 0% to 50% of their current sparsenes). This results in fewer collisions, faster collision resolution, fewer memory accesses, and better cache performance. A small side-benefit is halving the number of resize operations as the dictionary grows. The above table of dictionary sizes shows that odd numbered steps have the same size as the current approach while even numbered steps are twice as large. As a result, small dicts keep their current size and the amortized size of large dicts remains the same. Along the way, some dicts will be a little larger and will benefit from the increased sparseness. I would like to know what you guys think about the idea and would appreciate your verifying the performance on your various processors and operating systems. Raymond Hettinger P.S. The one line patch is: *** dictobject.c 17 Mar 2003 19:46:09 -0000 2.143 --- dictobject.c 25 Apr 2003 22:33:24 -0000 *************** *** 532,538 **** * deleted). */ if (mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2) { ! if (dictresize(mp, mp->ma_used*2) != 0) return -1; } return 0; --- 532,538 ---- * deleted). */ if (mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2) { ! if (dictresize(mp, mp->ma_used*4) != 0) return -1; }
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4