Guido van Rossum wrote:
>
> > Here's what I'll do:
> >
> > * implement .capitalize() in the traditional way for Unicode
> >   objects (simply convert the first char to uppercase)
> > * implement u.title() to mean the same as Java's toTitleCase()
> > * don't implement s.title(): the reasoning here is that it would
> >   confuse the user when she gets different return values for
> >   the same string (titlecase chars usually live in higher Unicode
> >   code ranges not reachable in Latin-1)
>
> Huh?  For ASCII at least, titlecase seems to map to ASCII; in your
> current implementation, only two Latin-1 characters (u'\265' and
> u'\377', I have no easy way to show them in Latin-1) map outside the
> Latin-1 range.

You're right, sorry for the confusion. I was thinking of other
encodings, e.g. cp437, which have corresponding characters
in the higher Unicode ranges.

> Anyway, I would suggest to add a title() call to 8-bit strings as
> well; then we can do away with string.capwords(), which does something
> similar but different, mostly by accident.

Ok, I'll do it this way then: s.title() will use C's toupper() and
tolower() for case mapping and u.title() the Unicode routines.

This will be in sync with the rest of the 8-bit string world
(which is locale aware on many platforms AFAIK), even though
it might not return the same string as the corresponding
u.title() call.

u.capwords() will be disabled in the Unicode implementation... it
wasn't even implemented for the string implementation, so there's
no breakage ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
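
For readers following along, here is a small sketch of the two points discussed above: how title() differs from string.capwords(), and which Latin-1 characters titlecase to something outside Latin-1. It assumes a current Python 3 interpreter, where every str is a Unicode string, so it illustrates only the u.title() side and says nothing about the locale-dependent 8-bit s.title(). The helper names compare_title_and_capwords and leaves_latin1 are made up for this illustration and are not part of any proposed API.

```python
import string

# The two Latin-1 characters named in the thread, written as Unicode escapes.
MICRO_SIGN = "\u00b5"    # u'\265' in the thread's octal notation
Y_DIAERESIS = "\u00ff"   # u'\377' in the thread's octal notation


def compare_title_and_capwords(text):
    """Return (text.title(), string.capwords(text)) for side-by-side comparison."""
    return text.title(), string.capwords(text)


def leaves_latin1(ch):
    """Return True if the titlecased form of ch contains a code point above 0xFF."""
    return any(ord(c) > 0xFF for c in ch.title())


if __name__ == "__main__":
    # title() restarts a word after any non-letter; capwords() splits on whitespace only.
    print(compare_title_and_capwords("they're here"))
    for ch in (MICRO_SIGN, Y_DIAERESIS, "a", "\u00e9"):
        print(hex(ord(ch)), leaves_latin1(ch))
```

Note that string.capwords() also collapses runs of whitespace into single spaces when called without a separator, which title() does not do; that is one of the ways the two are "similar but different".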