> it is real. I won't repeat the arguments one more time; please read
> the W3C character model note and the python-dev archives, and read
> up on the unicode support in Tcl and Perl.

I did read all that, so there really is no point in repeating the
arguments - yet I'm still not convinced. One of the causes may be that
all your commentary either

- discusses an alternative solution to the existing one, merely
  pointing out the difference, without any strong selling point, or
- explains small examples that work counter-intuitively.

I'd like to know whether you have an example of a real-world
big-application problem that could not be conveniently implemented
using the new Unicode API. For all the examples I can think of where
Unicode would matter (XML processing, CORBA wstring mapping,
internationalized messages and GUIs), it would work just fine. So
while it may not be perfect, I think it is good enough. Perhaps my
problem is that I'm not a perfectionist :-)

However, one remark from http://www.w3.org/TR/charmod/ reminded me of
an earlier proposal by Bill Janssen. The Character Model says

# Because encoded text cannot be interpreted and processed without
# knowing the encoding, it is vitally important that the character
# encoding is known at all times and places where text is exchanged or
# stored.

While they were considering document encodings, I think this applies
in general. Bill Janssen's proposal was that each (narrow) string
should have an attribute .encoding. If set, you'll know what encoding
a string has. If not set, it is a byte string, subject to the default
encoding.

I'd still like to see that as a feature in Python.

Regards,
Martin
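To make Janssen's idea concrete, here is a minimal sketch of what such
an attribute could look like, expressed in today's Python 3 terms where
the narrow string type is bytes; the EncodedString class name and the
to_unicode helper are made up for illustration and are not part of his
proposal:

    # Hypothetical sketch of the proposed .encoding attribute; the
    # class name and helper are illustrative, not a real API.
    class EncodedString(bytes):
        """A byte string that remembers which encoding produced it."""

        def __new__(cls, data, encoding=None):
            self = super().__new__(cls, data)
            # None means "plain byte string, subject to the default
            # encoding" - exactly the unset case in the proposal.
            self.encoding = encoding
            return self

        def to_unicode(self, default="ascii"):
            # Decode with the recorded encoding if known, otherwise
            # fall back to the (assumed) process-wide default.
            return self.decode(self.encoding or default)

    s = EncodedString("héllo".encode("latin-1"), encoding="latin-1")
    print(s.encoding)      # -> latin-1
    print(s.to_unicode())  # -> héllo

The point of the sketch is only that the encoding travels with the
bytes, so code receiving the string never has to guess how to decode
it, which is what the W3C remark above asks for.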