Fri May 6 21:42:16 CEST 2005 · https://mail.python.org/pipermail/python-dev/2005-May/053473.html

On May 6, 2005, at 2:49 PM, Nicholas Bastin wrote:
> If this is the case, then we're clearly misleading users.  If the
> configure script says UCS-2, then as a user I would assume that
> surrogate pairs would *not* be encoded, because I chose UCS-2, and it
> doesn't support that.  I would assume that any UTF-16 string I would
> read would be transcoded into the internal type (UCS-2), and
> information would be lost.  If this is not the case, then what does the
> configure option mean?

It means all the string operations treat strings as if they were UCS-2,
but that in actuality, they are UTF-16. Same as the case in the Windows
APIs and Java. That is, all string operations are essentially broken,
because they're operating on UTF-16 code units, not characters, but claim
to be operating on characters.
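
To make the distinction concrete, here is a minimal sketch in present-day
Python 3, where len() on a str counts code points. It re-creates the view a
UCS-2 build has of the same text by encoding to UTF-16 and counting 16-bit
units. The helper name utf16_code_units is made up for illustration and is
not part of any Python API.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints (hypothetical helper)."""
    data = text.encode("utf-16-le")  # little-endian, no byte order mark
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

# 'a' followed by U+1D11E MUSICAL SYMBOL G CLEF, which lies outside the BMP
s = "a\U0001D11E"
units = utf16_code_units(s)

code_point_count = len(s)      # what a character-based length reports
code_unit_count = len(units)   # what a length based on 16-bit units reports

# The second unit is a high surrogate, i.e. half of an encoded character.
second_unit_is_high_surrogate = 0xD800 <= units[1] <= 0xDBFF

print(code_point_count, code_unit_count, [hex(u) for u in units])
```

Indexing or slicing by code unit can therefore land in the middle of the
surrogate pair, which is the sense in which the operations claim to work on
characters but do not.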

James

[Python-Dev] New Py_UNICODE doc