"M.-A. Lemburg" wrote: > > ... > > Oh, I think that everybody agrees on moving to Unicode as > basic text storage container. The last time we went around there was an anti-Unicode faction who argued that adding Unicode support was fine but making it the default would inconvenience Japanese users. > ... > Well, with -U on, Python will compile "" into u"", so you can > already test Unicode compatibility today... last I tried, Python > didn't even start up :-( I'm going to say again that I don't see that as a test of Unicode-compatibility. It is a test of compatibility with our existing Unicode object. If we simply allowed string objects to support higher character numbers I *cannot see* how that could break existing code. > ... > We can use that knowledge to base future design upon. The problem > with many stdlib modules is that they don't make a difference > between text and binary data (and often can't, e.g. take sockets), > so we'll have to figure out a way to differentiate between the > two. We'll also need an easy-to-use binary data type -- as you > mention in the PEP, we could take the old string implementation > as basis and then perhaps turn u"" into "" and use b"" to mean > what "" does now (string object). I agree that we need all of this but I strongly disagree that there is any dependency relationship between improving the Unicode-awareness of I/O routines (sockets and files) and allowing string objects to support higher character numbers. I claim that allowing higher character numbers in strings will not break socket objects. It might simply be the case that for a while socket objects never create these higher charcters. Similarly, we could improve socket objects so that they have different readtext/readbinary and writetext/writebinary without unifying the string objects. There are lots of small changes we can make without breaking anything. One I would like to see right now is a unification of chr() and unichr(). 
We are just making life harder for ourselves by walking further and
further down one path when "everyone agrees" that we are eventually
going to end up on another path.

> ... It would be nice if we could avoid
> adding more conversion magic...

We already have more "magic" in our conversions than we need. I don't
think I'm proposing any new conversions.

 Paul Prescod