> For the record: my view of Unicode is really "ascii done right", i.e. a
> datatype that allows you to get richer characters than what 1960s ascii
> gives you.

Exactly, with the stress on *ASCII*. Almost everybody could agree on
ASCII; it is the 8-bit character sets where the troubles start.

> For this it should be as backward-compatible as possible, i.e. if
> some API expects a unicode filename and I pass "a.out" it should
> interpret it as u"a.out".

That works fine with the current API.

> All the converting to different charsets is icing on the cake, the
> number one priority should be that unicode is as compatible as
> possible with the 8-bit convention used on the platform (whatever it
> may be).

The problem is that there are multiple conventions on many systems,
and only the application can know which of these to apply.

> Using Python StringObjects as binary buffers is also far less common
> than using StringObjects to store plain old strings, so if either of
> these uses bites the other it's the binary buffer that needs to
> suffer.

This is a conclusion I cannot agree with. Most strings are really
binary, if you look at them closely enough :-)

> UnicodeObjects and StringObjects should behave pretty orthogonal to
> how FloatObjects and IntObjects behave.

For the Python programmer: yes; for the C programmer: memory management
makes that inherently difficult, which you don't have for int vs float.

Regards,
Martin
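
Illustration of the "multiple conventions" point above: a minimal sketch in present-day Python 3, not the Python 2 API under discussion in this thread. The byte string `raw` and the helper `decode_with` are hypothetical names chosen for the example. It shows that pure-ASCII bytes such as "a.out" mean the same thing under several 8-bit conventions, while a single byte above 0x7F does not, so only the caller can say which convention applies.

```python
# Hypothetical sample data: "caf" followed by one non-ASCII byte.
raw = b"caf\xe9"
ascii_only = b"a.out"

# Three 8-bit conventions that might all be in use on one system.
CONVENTIONS = ("latin-1", "cp437", "koi8_r")


def decode_with(data, encoding):
    """Interpret the bytes under one explicitly chosen convention."""
    return data.decode(encoding)


# Pure ASCII decodes identically under every convention tried here.
ascii_results = {decode_with(ascii_only, enc) for enc in CONVENTIONS}

# The non-ASCII byte yields a different character under each convention.
mixed_results = {enc: decode_with(raw, enc) for enc in CONVENTIONS}

if __name__ == "__main__":
    print(ascii_results)
    for enc, text in mixed_results.items():
        print(enc, repr(text))
```

Nothing in the bytes themselves says which of the three readings is intended; that knowledge has to come from the application.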