Greg Ward <gward@python.net> writes:

> My attitude is that textwrap should work on European languages, whether
> they are encoded in 8-bit "ASCII" or Unicode.

Please, don't assume any specific encoding. Why is Latin-1 better than
KOI8-R? The only encoding that is truly better than all others is
ASCII, since virtually all other encodings have ASCII as a subset
(except for the EBCDIC ones, and, with limitations, the ISO-2022 ones).

Also, you'll find more and more European languages encoded in UTF-8, so
your support would be useless and give wrong results.

[If you meant to suggest no specific processing for disregard this
comment]

> I suspect that passing an arbitrary Unicode string to it is
> meaningless -- what the heck does it even mean to wrap a string of
> Chinese or Hebrew or Devanagari characters? Beats me, and I think
> they're out of scope for textwrap.

Actually, the Unicode database has "line-breaking properties". Those
are not yet incorporated into unicodedata, but they could be used to
meaningfully extend the module to Unicode.

> So: do I even need to worry about the cornucopia of Unicode whitespace
> characters at all? Or can I sweep that can of worms under the rug?
> (Pardon the horribly mixed metaphor.)

Sweep away.

Regards,
Martin
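
To illustrate the encoding point above, here is a minimal sketch in present-day Python. It is not part of the original exchange; the sample string and variable names are made up for the example. It shows that ASCII text has the same bytes in Latin-1, KOI8-R and UTF-8, that non-ASCII bytes cannot be interpreted without knowing the encoding, and that wrapping is best done on decoded text rather than on raw bytes.

```python
import textwrap

# Hypothetical sample: the Russian word for "text" followed by an ASCII word.
sample = "текст wrap"

# Pure-ASCII text encodes to identical bytes in all of these encodings,
# which is why ASCII is the one safe assumption.
ascii_bytes = "wrap".encode("ascii")
same_everywhere = all(
    "wrap".encode(enc) == ascii_bytes
    for enc in ("latin-1", "koi8_r", "utf-8")
)

# The same text in two encodings: one byte per letter in KOI8-R,
# two bytes per Cyrillic letter in UTF-8.
koi8 = sample.encode("koi8_r")
utf8 = sample.encode("utf-8")

# Guessing Latin-1 for KOI8-R bytes does not fail, it silently yields
# different characters, so width-based logic would see the wrong text.
misread = koi8.decode("latin-1")

# Wrapping the decoded string counts characters, independent of how the
# text was stored on disk.
wrapped = textwrap.wrap(koi8.decode("koi8_r"), width=5)
```

Counting bytes instead of characters would give a different line width for the UTF-8 form of the same text, which is the "wrong results" case mentioned above.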