Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> > The UTF-8 assumption had to be made in order to get the two
> > worlds to interoperate. We could have just as well chosen
> > Latin-1, but then people currently using say a Russian
> > encoding would get upset for the same reason.
> >
> > One way or another somebody is not going to like whatever
> > we choose, I'm afraid... the simplest solution is to use
> > Unicode for all strings which contain non-ASCII characters
> > and then call .encode() as necessary.
>
> just a brief head's up:
>
> I've been playing with this a bit, and my current view is that
> the current unicode design is horridly broken when it comes
> to mixing 8-bit and 16-bit strings.

Why "horribly"? String and Unicode mix pretty well, IMHO.

The magic auto-conversion of Unicode to UTF-8 in C APIs using
"s" or "s#" does not always do what the user expects, but it's
still better than not having Unicode objects work with these
APIs at all.

> basically, if you pass a unicode string to a function slicing
> and dicing 8-bit strings, it will probably not work. and you
> will probably not understand why.
>
> I'm working on a proposal that I think will make things simpler
> and less magic, and far easier to understand. to appear on
> sunday.

Looking forward to it,

--
Marc-Andre Lemburg
______________________________________________________________________
Business:       http://www.lemburg.com/
Python Pages:   http://www.lemburg.com/python/
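(Editor's note: a minimal sketch of the approach Lemburg describes above, namely keeping text as Unicode internally and calling .encode() only at the boundaries. This thread predates Python 3; the example below illustrates the idea in modern Python 3 terms, where str is the Unicode type, and the sample string "Grüße" is purely hypothetical.)

```python
# Keep text as Unicode internally; encode explicitly at the boundaries
# instead of relying on any implicit conversion to a default encoding.

text = "Grüße"  # a Unicode string containing non-ASCII characters

# Explicit, per-boundary choice of encoding:
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(utf8_bytes)    # b'Gr\xc3\xbc\xc3\x9fe'
print(latin1_bytes)  # b'Gr\xfc\xdfe'

# Mixing byte strings of different encodings is exactly the kind of
# interoperability problem the thread is about: the same text yields
# different byte sequences depending on the chosen encoding.
```

The point of the pattern is that the encoding decision is made exactly once, at the point where text leaves the program, rather than implicitly wherever 8-bit and Unicode strings happen to meet.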