> From: "Fredrik Lundh" <effbot@telia.com>
>
> Peter Funk wrote:
> > AFAIK locale and friends conform to POSIX.1.  Calling this obsolescent...
> > hmmm... may offend a *LOT* of people.  Try this on comp.os.linux.advocacy ;-)
>
> you're missing the point -- now that we've added unicode support to
> Python, the old 8-bit locale *ctype* stuff no longer works.  while some
> platforms implement a wctype interface, it's not widely available, and it's
> not always unicode.

Huh?  We were talking strictly 8-bit strings here.  The locale support
hasn't changed there.

> so in order to provide platform-independent unicode support, Python 1.6
> comes with unicode-aware and fully portable replacements for the ctype
> functions.

For those who only need Latin-1 or another 8-bit ASCII superset, the
Unicode stuff is overkill.

> the code is already in there...
>
> > On POSIX systems there are a several environment variables used to
> > control the default locale settings for a users session.  For example
> > on my SuSE Linux system currently running in the german locale the
> > environment variable LC_CTYPE=de_DE is automatically set by a file
> > /etc/profile during login, which causes automatically the C-library
> > function toupper('ä') to return an 'Ä' ---you should see
> > a lower case a-umlaut as argument and an upper case umlaut as return
> > value--- without having all applications to call 'setlocale' explicitly.
> >
> > So this simply works well as intended without having to add calls
> > to 'setlocale' to all application program using this C-library functions.
>
> note that this leaves us with four string flavours in 1.6:
>
> - 8-bit binary arrays.  may contain binary goop, or text in some strange
>   encoding.  upper, strip, etc should not be used.

These are not strings.

> - 8-bit text strings using the system encoding.  upper, strip, etc works
>   as long as the locale is properly configured.
>
> - 8-bit unicode text strings.  upper, strip, etc may work, as long as the
>   system encoding is a subset of unicode -- which means US ASCII or
>   ISO Latin 1.

This is a figment of your imagination.  You can use 8-bit text strings
to contain Latin-1, but you have to set your locale to match.

> - wide unicode text strings.  upper, strip, etc always works.
>
> is this complexity really worth it?

From a backwards compatibility point of view, yes.  Basically,
programs that don't use Unicode should see no change in semantics.

--Guido van Rossum (home page: http://www.python.org/~guido/)
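
For readers who want the 8-bit versus wide-string distinction made concrete, here is a minimal sketch. It assumes modern Python 3, not the Python 1.6 under discussion, so it only illustrates the general idea: in Python 3, case conversion on raw bytes touches ASCII letters only, while a decoded text string is uppercased from Unicode case data regardless of locale. The variable names are made up for illustration.

```python
# Sketch only: Python 3 semantics, not the Python 1.6 behaviour debated above.

# A lower-case a-umlaut stored as a single Latin-1 byte.
raw = b"\xe4"

# bytes.upper() maps ASCII letters only, so this byte comes back unchanged.
bytes_upper = raw.upper()

# Decoding yields a text string; str.upper() consults Unicode case data
# and does not depend on the process locale.
text_upper = raw.decode("latin-1").upper()

print(bytes_upper)                    # the original byte, unchanged
print(text_upper.encode("latin-1"))   # the upper-case umlaut as a Latin-1 byte
```

The decode step is where the caller states which encoding the bytes are in, which is the information the locale setting supplied implicitly for 8-bit text strings in the thread above.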