Tim Peters wrote: > > [Jeremy] > > I also did a profile run on CreateInstances, which has a difference of > > +55.54% on my machine. It's basically the same story. The instance > > dictionary is getting resized more often with Python 2.1+ than it did > > with Python 1.5.2. I wouldn't be surprised if several more tests are > > showing a slowdown with the same cause. > > > > So boosting the minimum size sounds like a good thing. > > I don't know. PyBench is great for showing that *something* changed, but > it's got even less claim to "typical use" than pystone. It doesn't claim "typical use". pybench is aimed at finding out performance issues about hot-spots -- there's no such thing as a "typical program", so pybench gives you low level performance compares for very specific tasks, e.g. dictionary creation or for-loop performance. I have found it to be rather successful at that. At least gives some good hints at where to look... > I don't know that the test suite is better in that respect, but it's got much > more variety and everyone has it <wink>. I stuffed code in dict_dealloc() to > record the ma_fill of each dict on its way to the grave (ma_fill == number of > non-virgin slots). Across the test suite, here's the ranking, from most to > least popular fill: > > count fill %total cumulative % > ------ ---- ------ ------------ > 146321 1 53.30 53.30 > 38200 0 13.91 67.21 > 32616 2 11.88 79.09 > 29648 3 10.80 89.89 > 9884 5 3.60 93.49 > 5423 4 1.98 95.47 > 2428 6 0.88 96.35 > 2016 8 0.73 97.08 > 1179 7 0.43 97.51 > 904 9 0.33 97.84 > 709 103 0.26 98.10 > 554 10 0.20 98.30 > 513 13 0.19 98.49 > 459 12 0.17 98.66 > 447 11 0.16 98.82 > 364 14 0.13 98.95 > 233 15 0.08 99.04 > 231 16 0.08 99.12 > 193 18 0.07 99.19 > 180 17 0.07 99.26 > 122 19 0.04 99.30 > 107 30 0.04 99.34 > 105 21 0.04 99.38 > 93 22 0.03 99.41 > 93 20 0.03 99.45 > 86 256 0.03 99.48 > 82 23 0.03 99.51 > 80 26 0.03 99.54 > 74 24 0.03 99.56 > 69 27 0.03 99.59 > 64 25 0.02 99.61 > 60 29 0.02 99.63 > 49 28 0.02 99.65 > 44 34 0.02 99.67 > 33 32 0.01 99.68 > 28 31 0.01 99.69 > 27 37 0.01 99.70 > 27 33 0.01 99.71 > 26 35 0.01 99.72 > 24 36 0.01 99.73 > 23 39 0.01 99.74 > 23 38 0.01 99.75 > 21 128 0.01 99.75 > 19 44 0.01 99.76 > 19 40 0.01 99.77 > 17 46 0.01 99.77 > 16 48 0.01 99.78 > 15 47 0.01 99.78 > 14 50 0.01 99.79 > 14 42 0.01 99.79 > > There are many more sizes, but I cut off the display here when they got too > rare to round to 1% of 1% of the total count. > > Boosting the first non-empty size to 8 would allow 93+% of all dicts to get > away with at most one resize (a dict of size 8 is enough for a fill of 5, but > not 6). OTOH, the current first non-empty size of 4 is enough for 79% of all > dicts (enough for a fill of 2, but not 3). If oodles of those tiny dicts are > alive *at the same time*, it would be quite a waste of space to force the > non-empty ones to carry 8 slots. OTOH, if those small dicts are due to > things like building one- or two-element keyword argument dicts, their > lifetimes rarely overlap. I found that instance dictionaries are usual within the 8 slot range. You normally have a few heavy wheight instances and many light wheight ones which only have two or three attributes in their instance dict. > A more aggressive idea is to allow denser dicts, by allowing them to become > no more than 75% full. That is, change the resize test from > > mp->ma_fill*3 >= mp->ma_size*2 > > to > > mp->ma_fill*4 > mp->ma_size*3 > > That would allow the 10.8% of real(er) life dicts with fill 3 to continue > living in dicts with 4 slots, and allow about 90% of all dicts to get away > with no more than one resize. The downside is that boosting the max load > factor from 2/3 to 3/4 yields, "in theory", and for a dict hugging the limit, > a small boost in the expected # of compares. But the "theory" is for random > hash functions with "uniform probing" (tech term that does *not* mean linear > probing), and Python's hash functions often aren't random at all, while AFAIK > no rigorous analysis of its probing strategy exists. > > So, plenty of arbitrary data there upon which to flip a coin <wink>. Why not make those parameters macros at the top of dictobject.c which can then be tuned to whatever the programmer needs/wants ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4