RetroSearch Browse

Sat May 7 02:15:27 CEST 2005 · https://mail.python.org/pipermail/python-dev/2005-May/053500.html

Shane Hathaway wrote:
> Ok.  Thanks for helping me understand where Python is WRT unicode.  I
> can work around the issues (or maybe try to help solve them) now that I
> know the current state of affairs.  If Python correctly handled UTF-16
> strings internally, we wouldn't need the UCS-4 configuration switch,
> would we?

Define correctly. Python, in ucs2 mode, will allow to address individual
surrogate codes, e.g. in indexing. So you get

>>> u"\U00012345"[0]
u'\ud808'

This will never work "correctly", and never should, because an efficient
implementation isn't possible. If you want "safe" indexing and slicing,
you need ucs4.

Regards,
Martin

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Showing content from https://mail.python.org/pipermail/python-dev/2005-May/053500.html below:

[Python-Dev] New Py_UNICODE doc