Showing content from https://mail.python.org/pipermail/python-dev/2009-April/089193.html below:

Non-decodable Bytes in System Character Interfaces

[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Baptiste Carvello baptiste13z at free.fr
Wed Apr 29 10:43:49 CEST 2009
Lino Mastrodomenico wrote:
> 
> Only for the new utf-8b encoding (if Martin agrees), while the
> existing utf-8 is fine as is (or at least waaay outside the scope of
> this PEP).
> 

This is questionable. It would mean that \udcxx in a Python
string would sometimes denote a surrogate and sometimes raw bytes, depending
on the history of the string.

By contrast, if the new utf-8b codec were to *supersede* the old one, \udcxx would
always mean raw bytes (at least on UCS-4 builds, where surrogates are unused).
Thus the ambiguity could be avoided.
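
The ambiguity can be illustrated with a small sketch. It assumes that the
"utf-8b" behaviour discussed here is modelled by the `surrogateescape` error
handler, which is how PEP 383 was later exposed in Python 3.1 and newer; the
variable names are purely illustrative. Two strings with different histories
compare equal, so nothing in the string records whether \udce9 stands for a
raw byte or for a surrogate code point.

```python
# Assumption: "utf-8b" is modelled with the surrogateescape error handler
# (Python 3.1+). All names below are illustrative only.
raw = b"caf\xe9"  # Latin-1 bytes; 0xE9 is not valid UTF-8 in this position

# The undecodable byte 0xE9 is carried into the str as lone surrogate U+DCE9.
from_bytes = raw.decode("utf-8", errors="surrogateescape")

# A str that contains U+DCE9 because a surrogate was written there directly.
from_literal = "caf\udce9"

# Both strings are identical: the history of the string is lost.
same = from_bytes == from_literal

# The escaping handler reads U+DCE9 as "raw byte 0xE9" ...
as_raw = from_literal.encode("utf-8", errors="surrogateescape")

# ... while surrogatepass reads the same code point as a surrogate.
as_surrogate = from_literal.encode("utf-8", errors="surrogatepass")

# The plain strict codec rejects lone surrogates altogether.
try:
    from_literal.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

Which bytes come out therefore depends only on the codec or handler chosen at
encoding time, which is the situation a single superseding codec would avoid.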

Baptiste

