RetroSearch Browse

Thu Aug 25 02:11:42 CEST 2011 · https://mail.python.org/pipermail/python-dev/2011-August/113060.html

Antoine Pitrou writes:
 > Le jeudi 25 août 2011 à 02:15 +0900, Stephen J. Turnbull a écrit :
 > > Antoine Pitrou writes:
 > >  > On Thu, 25 Aug 2011 01:34:17 +0900
 > >  > "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
 > >  > > 
 > >  > > Martin has long claimed that the fact that I/O is done in terms of
 > >  > > UTF-16 means that the internal representation is UTF-16
 > >  > 
 > >  > Which I/O?
 > > 
 > > Eg, display of characters in the interpreter.
 > 
 > I don't know why you say it's "done in terms of UTF-16", then. Unicode
 > strings are simply encoded to whatever character set is detected as the
 > terminal's character set.

But it's not "simple" at the level we're talking about!

Specifically, *in-memory* surrogates are properly respected when doing
the encoding, and therefore such I/O is not UCS-2 or "raw code units".
This treatment is different from sizing and indexing of unicodes,
where surrogates are not treated differently from other code points.

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Showing content from https://mail.python.org/pipermail/python-dev/2011-August/113060.html below:

[Python-Dev] PEP 393 Summer of Code Project