> Guido van Rossum wrote:
> > I couldn't have said it better. It's okay for now to have it
> > changeable at the C level -- with endless caveats that it should be
> > set only once before any use, and marked as an experimental feature.
> > But the Python access and the reliance on the environment should go.

[MAL replies]
> Sorry, but I'm really surprised now: I've put many hours of
> work into this, hacked up encoding support for locale.py,
> went through endless discussions, proposed the changeable default
> as compromise to make all parties (ASCII, UTF-8 and Latin-1) happy
> ... and now all it takes is one single posting to render all that
> work useless ???

I'm sorry too. As Fred Drake explained, the changeable default was an
experiment. I won't repeat his excellent response.

I am perhaps to blame for the idea that the character set of 8-bit
strings in C can be derived in some way from the locale -- but the
main reason I brought it up was as a counter-argument to the Latin-1
fixed default that effbot argued for. I never dreamed that you could
actually find out the name of the character set given the locale!

> Instead of tossing things we should be *constructive* and come
> up with a solution to the hash value problem, e.g. I would
> like to make the hash value be calculated from the UTF-16
> value in a way that is compatible with ASCII strings.

I think you are proposing to drop the following rule:

  if a == b then hash(a) == hash(b)

or also

  if hash(a) != hash(b) then a != b

This is very fundamental for dictionaries! Note that it is currently
broken:

  >>> d = {'\200':1}
  >>> d['\200']
  1
  >>> u'\200' == '\200'
  1
  >>> d[u'\200']
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  KeyError:
  >>>

While you could fix this with a variable encoding, it would be very
hard, probably involving converting the string to Unicode before
taking its hash, and this would slow down the hash calculation for
8-bit strings considerably (and these are fundamental for the speed of
the language!).

So I am for restoring ASCII as the one and only fixed encoding. (Then
you can fix your hash much easier!)

Side note: the KeyError handling is broken. The bad key should be run
through repr() (probably when the error is raised rather than when it
is displayed).

--Guido van Rossum (home page: http://dinsdale.python.org/~guido/)
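
To illustrate the rule "if a == b then hash(a) == hash(b)" and why dictionaries depend on it, here is a minimal sketch in present-day Python 3, not the interpreter used in the session above. The `Key` class, its `hash_encoding` parameter, the byte-sum hash and the `lookup` helper are all hypothetical, invented only for this illustration; they are not how Python hashes strings. Two keys that compare equal but hash their text under different encodings cannot find each other in a dictionary, while pure ASCII text hashes the same under either encoding.

```python
class Key:
    """Hypothetical key: equality compares text, hashing uses a chosen encoding."""

    def __init__(self, text, hash_encoding):
        self.text = text
        self.hash_encoding = hash_encoding

    def __eq__(self, other):
        return isinstance(other, Key) and self.text == other.text

    def __hash__(self):
        # Toy hash (an assumption for this sketch): the sum of the encoded
        # bytes, chosen only because it is easy to check by hand.
        return sum(self.text.encode(self.hash_encoding))


def lookup(d, key):
    """Return d[key], or None if the dictionary cannot find the key."""
    try:
        return d[key]
    except KeyError:
        return None


stored = Key("\x80", "latin-1")      # encodes to b"\x80", hash 128
same_hash = Key("\x80", "latin-1")   # equal to stored, same hash
other_hash = Key("\x80", "utf-8")    # equal to stored, but b"\xc2\x80", hash 322

d = {stored: 1}
print(stored == other_hash)      # True: the keys compare equal
print(lookup(d, same_hash))      # 1: equal key with equal hash is found
print(lookup(d, other_hash))     # None: equal key with a different hash is missed

# For pure ASCII text both encodings yield the same bytes, so the rule holds.
ascii_a = Key("abc", "latin-1")
ascii_b = Key("abc", "utf-8")
print(lookup({ascii_a: 2}, ascii_b))   # 2
```

The dictionary consults the hash before it ever calls `__eq__`, so an equal key with a different hash is never compared against the stored one.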