> Hmm, shouldn't StringObjects themselves carry an encoding field
> (defaulting to sys.encoding)?

That approach was discussed during the design phase of the Unicode API;
Bill Janssen was the first to propose it, in response to my talk

http://www.python.org/workshops/1997-10/proceedings/loewis.html

During the Unicode design, the idea came up several times, but it always
turned out that its proposers could not give coherent semantics to such
tags. Just explain what happens if you add two strings that have
different encodings.

> That would solve quite a few issues.

And introduce many new ones.

> > Making UTF-8 the default Python system encoding would have many other
> > consequences -- and you'd probably lose a great deal of portability
> > since UTF-8 conversion (nearly) always will succeed while ASCII can
> > easily fail on other systems which use e.g. Latin-1 as native
> > encoding.
>
> What are your reasons for asserting this?

If I understand this claim correctly, he means:

"Currently, if auto-conversion (to ASCII) succeeds, the result is likely
correct. If the default encoding were UTF-8, conversion would succeed
for all Unicode objects, but give incorrect results for many users,
e.g. if they use Latin-1 on their terminal."

This has actually been a frequent problem since the introduction of
UTF-8: some applications display the bytes that make up a UTF-8 string
as if they were a Latin-1 string, rendering it completely unreadable
(although I can already recognize my name if I run into such an
application). This problem may go unnoticed during testing, whereas an
exception is likely to be noticed.

> If I read this correctly this would make Python compatible to the
> least common denominator of all platforms, while I think I would
> prefer it to allow access to all the niceties a platform gives.

It does no such thing. The application has full control over all
conversions, if it initiates them explicitly. Explicit is better than
implicit.

Regards,
Martin
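
To make the two failure modes above concrete, here is a minimal sketch. It assumes a modern Python 3 interpreter, where text and bytes are separate types, so the implicit conversion under discussion is simulated with explicit encode/decode calls. The sample string and all variable names are chosen purely for illustration.

```python
# Illustrative sample text containing one non-ASCII character (assumption).
name = "L\u00f6wis"  # "Löwis"

# Encoding to UTF-8 succeeds for this text, as it does for nearly any text.
utf8_bytes = name.encode("utf-8")

# A consumer that assumes Latin-1 decodes the same bytes without any
# error, but the result is silently wrong: the two bytes C3 B6 that
# encode U+00F6 are shown as two separate Latin-1 characters.
misread = utf8_bytes.decode("latin-1")

# Encoding to ASCII fails loudly instead, so the problem is noticed.
try:
    name.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Adding two byte strings that hold the same text in different encodings
# yields a sequence that is not valid UTF-8 and is wrong as Latin-1.
mixed = name.encode("latin-1") + utf8_bytes
try:
    mixed.decode("utf-8")
    mixed_is_valid_utf8 = True
except UnicodeDecodeError:
    mixed_is_valid_utf8 = False

print(repr(misread), ascii_failed, mixed_is_valid_utf8)
```

The silent case (`misread`) is the one that can slip through testing; the ASCII case raises an exception at the point of conversion, and the concatenated bytes show why a per-string encoding tag has no obvious meaning for the result of an addition.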