Showing content from https://mail.python.org/pipermail/python-dev/2010-November/105909.html below:

[Python-Dev] len(chr(i)) = 2?

Amaury Forgeot d'Arc amauryfa at gmail.com
Tue Nov 23 20:19:28 CET 2010
2010/11/23 Alexander Belopolsky <alexander.belopolsky at gmail.com>:
> This discussion motivated me to start looking into how well the Python
> library itself is prepared to deal with len(chr(i)) = 2.  I was not
> surprised to find that textwrap does not handle the issue well:
>
> >>> len(wrap(' \U00010140' * 80, 20))
> 12
> >>> len(wrap(' \U00000140' * 80, 20))
> 8
>
> That module should probably be rewritten to properly implement the
> Unicode line breaking algorithm
> <http://unicode.org/reports/tr14/tr14-22.html>.
>
> Yet finding a bug in a str object method after a 5 min review was a
> bit discouraging:
>
> >>> 'xyz'.center(20, '\U00010140')
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> TypeError: The fill character must be exactly one character long
>
> Given the apparent difficulty of writing even basic text processing
> algorithms in the presence of surrogate pairs, I wonder how wise it is to
> expose Python users to them.

This was already discussed two years ago:

http://mail.python.org/pipermail/python-dev/2008-July/080900.html

So yes, wrap() and center() should be fixed.

--
Amaury Forgeot d'Arc
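
The numbers quoted above can be reproduced without a narrow build. The
following is a minimal sketch, not code from the thread, and it assumes
Python 3.3 or later, where len() counts code points rather than UTF-16 code
units. The helpers utf16_len() and emulate_narrow() are hypothetical names
introduced here for illustration: the first counts UTF-16 code units, and the
second only mimics the width a surrogate pair had on a narrow build by
substituting two placeholder characters for each non-BMP character.

```python
from textwrap import wrap


def utf16_len(s):
    """Return the number of UTF-16 code units needed to encode s.

    On a narrow build this is what len(s) reported (assumption: no lone
    surrogates in s, since those cannot be encoded).
    """
    return len(s.encode("utf-16-le")) // 2


def emulate_narrow(s, placeholder="X"):
    """Replace each non-BMP character with two placeholder characters.

    Hypothetical helper: it only reproduces the *length* a surrogate pair
    had on a narrow build, not its content.
    """
    return "".join(placeholder * 2 if ord(ch) > 0xFFFF else ch for ch in s)


astral = " \U00010140" * 80   # 80 words, each one non-BMP character
bmp = " \u0140" * 80          # 80 words, each one BMP character

# One non-BMP character occupies two UTF-16 code units.
print(utf16_len("\U00010140"))

# Line counts when wrap() sees two units per word versus one.
print(len(wrap(emulate_narrow(astral), 20)))
print(len(wrap(bmp, 20)))

# With code-point strings, both inputs wrap to the same number of lines,
# and center() accepts a non-BMP fill character.
print(len(wrap(astral, 20)))
print(len("xyz".center(20, "\U00010140")))
```

The emulated input gives the larger line count because wrap() measures width
with len(), so each two-unit word uses up twice as much of the 20-column
budget.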