[Neil Schemenauer] > A table of keys and values. Values are looked up by looping over > the table comparing each key until the correct one is found (ie. > its O(n) where n is the size of the table). For Python, the cost > of doing compares probably outweighs the cost of doing the > hashing, even for small tables. I thought about that before. The inlining appeals but the algorithm not much: the dict implementation *as is* loops over all the table entries too, except that instead of starting with "i = 0" it starts (now) with "i = hash & mask"; instead of incrementing via "++i" it does "i <<= 1; if (i > mask) i ^= poly"; and instead of giving up when "i >= length" it punts when finding an entry with a null value. Incrementing via ++i is certainly cheaper, except that even when small, the hash table usually hits on the first try when the key is present, so usually gets out before incrementing. > Its not clear to me though if it would be a win. Best guess is not. > Assuming that interned strings are the most common key, a assocation > table with four entries would take on average two pointer compares > to look up a value. Actually an average of 2.5 when the key is present and each key is equally likely to be queried, and always 4 when the queried key is not present. The hash table has better expected stats on both counts, but needs 4 unused slots too to achieve that. The savings would be in memory for small dicts more than in time (if at all).
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4