Wed Nov 24 00:49:42 CET 2010 · http://mail.python.org/pipermail/python-dev/2010-November/105939.html

Alexander Belopolsky wrote:

> """
> Because the most commonly used characters are all in the Basic
> Multilingual Plane, converting between surrogate pairs and the
> original values is often not tested thoroughly. This leads to
> persistent bugs, and potential security holes, even in popular and
> well-reviewed application software.
> """

Maybe Python should have used UTF-8 as its internal Unicode
representation. Then people who were foolish enough to assume
one character per string item would have their programs break
rather soon under only light Unicode testing. :-)
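
A rough sketch of the point, not part of the original message: it
counts how many storage units one character needs in UTF-8 and in
UTF-16. It assumes Python 3.3 or later, where a str item is always a
full code point; the sample characters and the helper name unit_counts
are chosen here purely for illustration.

```python
# Illustrative sketch only: the sample characters and the helper name
# are assumptions, not taken from the thread.
samples = ["a", "\u00e9", "\u20ac", "\U0001d11e"]  # a, e-acute, euro sign, G clef


def unit_counts(ch):
    """Return (UTF-8 bytes, UTF-16 code units) needed to store one character."""
    utf8_units = len(ch.encode("utf-8"))
    # "utf-16-le" emits no byte order mark, so bytes // 2 is the unit count.
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_units, utf16_units


for ch in samples:
    print("U+%04X" % ord(ch), unit_counts(ch))
```

With these samples, the UTF-8 count is above one for every character
outside ASCII, while the UTF-16 count is above one only for the G clef,
which lies outside the Basic Multilingual Plane. That is why a test
suite using only BMP text tends to miss surrogate-pair bugs.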

-- 
Greg

[Python-Dev] len(chr(i)) = 2?