Bill Tutt wrote:
>
> > MAL wrote:
> > ... I wonder why compiling "print u'\uD800'" causes the
> > hash value to be computed ...
>
> That's an easy one. Com_addconst() (or something it calls) calls
> PyObject_Hash() during the compilation process.

Ah ok.

> Re: UTF-8
> There's no reason why you can't support surrogates in UTF-8, while still not
> supporting them in slice notation.

True.

> It's certainly the easiest way to fix the problem.

Well, it doesn't really fix the problem... your note only made it
clear that the change in default encoding (be it ASCII or whatever
the locale defines) has the unwanted side effect of breaking the
hash/cmp rule for non-ASCII character strings vs. Unicode.

Perhaps pushing the default encoding down all the way is the solution
(with some trickery this is possible now, since changing the default
encoding is only allowed in site.py), or simply stating that the
hash/cmp rule only works for ASCII contents of the objects.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
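
For readers following along, here is a minimal sketch of the hash/cmp rule at issue: objects that compare equal must also hash equal. It is only a model, not the interpreter's code. It assumes that a byte string is hashed over its raw bytes, that a Unicode string is hashed over its UTF-8 form, and that comparison decodes the byte string with the default encoding. The names bytes_hash_key, text_hash_key, equal_under and rule_holds are hypothetical helpers made up for this illustration.

```python
# Hypothetical model of the hash/cmp rule; not taken from the interpreter.

def bytes_hash_key(raw: bytes) -> bytes:
    # Assumption: a byte string is hashed over its raw bytes.
    return raw

def text_hash_key(text: str) -> bytes:
    # Assumption: a Unicode string is hashed over its UTF-8 encoding.
    return text.encode("utf-8")

def equal_under(raw: bytes, text: str, default_encoding: str) -> bool:
    # Assumption: comparison decodes the bytes with the default encoding;
    # undecodable bytes are treated as "not equal" in this model.
    try:
        return raw.decode(default_encoding) == text
    except UnicodeDecodeError:
        return False

def rule_holds(raw: bytes, text: str, default_encoding: str) -> bool:
    # The rule: a == b must imply hash(a) == hash(b).
    if not equal_under(raw, text, default_encoding):
        return True  # nothing to check, the objects are not equal
    # Identical hash inputs give identical hashes; differing inputs
    # will (barring a collision) give differing hashes.
    return bytes_hash_key(raw) == text_hash_key(text)

# ASCII contents satisfy the rule under any ASCII-compatible encoding.
print(rule_holds(b"abc", "abc", "ascii"))
print(rule_holds(b"abc", "abc", "latin-1"))
# Non-ASCII contents under a Latin-1 default compare equal but are
# hashed over different byte sequences, so the rule breaks.
print(rule_holds(b"\xe4", "\u00e4", "latin-1"))
```

Under these assumptions the rule survives only for ASCII contents, or when the default encoding matches the encoding used for the Unicode hash, which corresponds to the two options mentioned in the message.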