> [/F proves beyond a shadow of a doubt that string.whitespace is
> locale-sensitive]
>
> Thanks, Fredrik!  That clarifies the behaviour Just is seeing.
>
> Hey: I just realized that making textwrap trust string.whitespace is
> wrong in at least one case: 0xa0 is *non-breaking* space in ISO-8859-1,
> and converting it to 0x20 (regular ol' space) is clearly wrong -- the
> "non-break" request will be ignored.  So Unicode or not, textwrap should
> probably just hard-code the US-ASCII whitespace chars.

+1

> My attitude is that textwrap should work on European languages, whether
> they are encoded in 8-bit "ASCII" or Unicode.  I suspect that passing an
> arbitrary Unicode string to it is meaningless -- what the heck does it
> even mean to wrap a string of Chinese or Hebrew or Devanagari characters?
> Beats me, and I think they're out of scope for textwrap.

Correct -- you can't trust the width of characters to be all the same.
(I'm not even sure if that's true for Latin-1, Cyrillic or Greek, but
it seems likely.)

> So: do I even need to worry about the cornucopia of Unicode whitespace
> characters at all?  Or can I sweep that can of worms under the rug?
> (Pardon the horribly mixed metaphor.)

Please shove them under the garage.

--Guido van Rossum (home page: http://www.python.org/~guido/)
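
For illustration only, here is a rough sketch of what "hard-code the US-ASCII whitespace chars" could look like. It is written for a current Python 3 and is not the actual textwrap code; the names ASCII_WHITESPACE, normalize_whitespace and split_words are made up for this example. The point is that 0xa0 never appears in the hard-coded set, so it is neither converted to 0x20 nor treated as a place to break.

```python
import re

# US-ASCII whitespace only, spelled out by hand instead of being taken
# from a locale-sensitive table. 0xa0 (non-breaking space) is
# deliberately absent. (Name is hypothetical, chosen for this sketch.)
ASCII_WHITESPACE = "\t\n\x0b\x0c\r "

# Matches one or more of exactly the characters listed above.
_ASCII_WS_RUN = re.compile(r"[\t\n\x0b\x0c\r ]+")


def normalize_whitespace(text):
    """Map each US-ASCII whitespace character to a plain space (0x20).

    Characters outside the hard-coded set, including 0xa0, pass through
    unchanged.
    """
    table = {ord(ch): " " for ch in ASCII_WHITESPACE}
    return text.translate(table)


def split_words(text):
    """Split on runs of US-ASCII whitespace only.

    A non-breaking space stays inside its word, so a wrapper built on
    this split would never break a line there.
    """
    return [word for word in _ASCII_WS_RUN.split(text) if word]


sample = "10\xa0km\tfrom\nhere"
print(repr(normalize_whitespace(sample)))
print(split_words(sample))
```

By contrast, calling sample.split() with no arguments on a Python 3 str also splits at 0xa0, which is exactly the "non-break request will be ignored" problem described above.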