On Mon, 22 May 2000, Guido van Rossum wrote:
> > note that this leaves us with four string flavours in 1.6:
> >
> > - 8-bit binary arrays. may contain binary goop, or text in some strange
> >   encoding. upper, strip, etc should not be used.
>
> These are not strings.

Indeed -- but at the moment, we're letting people continue to
use strings this way, since they already do it.

> > - 8-bit text strings using the system encoding. upper, strip, etc works
> >   as long as the locale is properly configured.
> >
> > - 8-bit unicode text strings. upper, strip, etc may work, as long as the
> >   system encoding is a subset of unicode -- which means US ASCII or
> >   ISO Latin 1.
>
> This is a figment of your imagination. You can use 8-bit text strings
> to contain Latin-1, but you have to set your locale to match.

I would like it to be only the latter, as Fred, i, and others
have previously suggested, and as corresponds to your ASCII
proposal for treatment of 8-bit strings.

But doesn't the current locale-dependent behaviour of upper() etc.
mean that strings are getting interpreted in the first way?

> > is this complexity really worth it?
>
> From a backwards compatibility point of view, yes. Basically,
> programs that don't use Unicode should see no change in semantics.

I'm afraid i have to agree with this, because i don't see any
other option that lets us escape from any of these four ways
of using strings...


-- ?!ng