On Tue, Jan 17, 2012 at 12:59 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> I'd like to propose a different approach to seeding the string hashes:
> only do so for dictionaries involving only strings, and leave the
> tp_hash slot of strings unchanged.
>
> Each string would get two hashes: the "public" hash, which is constant
> across runs and bugfix releases, and the dict-hash, which is only used
> by the dictionary implementation, and only if all keys to the dict are
> strings. In order to allow caching of the hash, all dicts should use
> the same hash (if caching wasn't necessary, each dict could use its own
> seed).
>
> There are several variants of that approach wrt. caching of the hash
> 1. add an additional field to all string objects, to cache the second
>    hash value.

Yuck, our objects are large enough as it is.

>    a) variant: in 3.3, drop the extra field, and declare that hashes
>       may change across runs

+1 Absolutely. We can and should make 3.3 change hashes across runs (behavior that can be disabled via a flag or environment variable).

I think the issue of doctests and such breaking even in 2.7 due to hash order changes is being overblown. Code like that has already needed to fix its tests at least once if it wants them to pass on both 32-bit and 64-bit Python VMs (they have different hashes). Do we have _any_ measure of how big a deal this will be before going too far here?

-gps
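
For readers who want to see the "hashes may change across runs" behavior concretely, here is a rough sketch. It assumes a CPython that supports the PYTHONHASHSEED environment variable (as later releases do; the message above only says "a flag or environment variable" and names nothing specific). The helper names hash_in_fresh_interpreter and stable_repr are made up for illustration. The second helper shows one way a test can avoid depending on hash order at all.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, seed):
    """Return hash(text) as computed by a new interpreter started with
    PYTHONHASHSEED=seed (assumption: the interpreter honors that variable)."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out)


def stable_repr(mapping):
    """Render a dict with its keys sorted, so that expected output in a
    doctest does not depend on the interpreter's hash values."""
    items = ", ".join("%r: %r" % (key, mapping[key]) for key in sorted(mapping))
    return "{" + items + "}"


if __name__ == "__main__":
    # Same seed: reproducible. Different seeds: the string hash changes.
    print(hash_in_fresh_interpreter("spam", 1))
    print(hash_in_fresh_interpreter("spam", 1))
    print(hash_in_fresh_interpreter("spam", 2))
    print(stable_repr({"b": 1, "a": 2}))
```

A test written against stable_repr (or against sorted(d.items())) should pass regardless of seed or of 32-bit versus 64-bit builds, which is the kind of fix the message argues such code has already had to make.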