> Here's a quick pre-PEP. > > > Allow mutable objects as dictionary keys in certain cases by adding a > new protocol. dict.__setitem__() and __contains__() would work like this > bit of pseudo code: > > def __setitem__(self, key, value): > if not hashable(key): > if hasattr(key, "__immutable_copy__"): > key = key.__immutable_copy__() > else: > raise TypeError, "unhashable key" > ...insert value with key into dict... > > def __contains__(self, key): > if not hashable(key): > if hasattr(key, "__temporary_immutable__"): > key = key.__temporary_immutable__() > else: > raise TypeError, "unhashable key" > ...test for key membership... > > (other methods would be similar) > > When implemented in C in dictobject.c, this would add no overhead for > the "normal" case (when keys _are_ directly hashable). I have a small doubt about this claim. Inserting code into PyDict_SetItem(), even if it is normally never executed, may reduce code locality and hence cause the code to miss the cache more often than before. This *may* be addressed by rearranging the code, but there's a limit to that. I have a countersuggestion. Could you do this as a subclass of dict? A subclass in Python would be inefficient, but might still be good enough for your second use case (the ObjC bridge). If not, a subclass in Python might be feasible -- it would be a little bit more work than integrating it into dictobject.c, but you have a lot less convincing to do, and you can even get it to work with Python 2.2. > The __temporary_immutable__ part of the protocol is needed to avoid > unneccesary copying in those cases where the key isn't actually inserted > into the dict, but is only used for membership testing. It would return > a wrapper around the mutable object, defining __eq__ and __hash__. (This > idea is stolen from the sets.py module.) If an implementer doesn't care > about this efficiency, __temporary_immutable__ can be an alias to > __immutable_copy__. > > > Use Cases: > > - sets.py could be written more efficiently and/or could be more easily > be rewritten in C. > > - Again, the troubles of bridging Python to Objective-C have triggered > this idea. Due to the nature of Objective-C in general, and Cocoa in > particular, we can never be sure beforehand whether a string is > mutable or not. We can detect mutability at runtime, but depending on > this is dangerous as it exposes an implementation detail that may > change between releases of _any_ API that returns strings. Therefore > code that works in one setup may break in another, or even in the the > same setup after an arbitrary library upgrade, simply because an API > suddenly returns a mutable string where it didn't before. Yes this > sucks, but it's the way it is. <Add link to post from bbum explaining > this situation in detail> > > > Backwards compatibilty: > > 100%. > > Overhead added > > None > > > Implementation > > There's quite a bit of code duplication in those areas of dictobject.c > that matter for this proposal, so implementing it will take longer than > the 10 minutes I originally estimated <wink>. Perhaps this is a good > opportunity to refactor these bits into inline functions? Not being a C > guru I can't judge whether this is possible or desirable with the set of > compilers that Python must be built with. You can't rely on inline. OTOH, GCC -O3 often inlines static functions without needing a hint. --Guido van Rossum (home page: http://www.python.org/~guido/)
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4