Showing content from https://mail.python.org/pipermail/python-dev/2009-April/089133.html below:

[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Hrvoje Niksic hrvoje.niksic at avl.com
Tue Apr 28 15:06:17 CEST 2009
Lino Mastrodomenico wrote:
> Since this byte sequence [b'\xed\xb3\xbf'] doesn't represent a valid character when
> decoded with UTF-8, it should simply be considered an invalid UTF-8
> sequence of three bytes and decoded to '\udced\udcb3\udcbf' (*not*
> '\udcff').

"Should be considered" or "will be considered"?  Python 3.0's UTF-8
decoder happily accepts it and returns '\udcff':

 >>> b'\xed\xb3\xbf'.decode('utf-8')
'\udcff'

If the PEP depends on this being changed, it should be mentioned in the PEP.
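
For comparison, here is a minimal sketch of the behaviour the quoted text asks for, assuming a later Python 3 interpreter in which the strict UTF-8 decoder rejects encoded surrogates and the PEP's 'surrogateescape' error handler is available. The helper names decode_strict and decode_escaped are made up for illustration and are not part of the PEP.

```python
RAW = b'\xed\xb3\xbf'  # the three-byte sequence discussed above


def decode_strict(data):
    """Return the decoded string, or None if strict UTF-8 rejects the bytes."""
    try:
        return data.decode('utf-8')
    except UnicodeDecodeError:
        return None


def decode_escaped(data):
    """Decode with 'surrogateescape': each undecodable byte 0xNN becomes U+DCNN."""
    return data.decode('utf-8', 'surrogateescape')


strict_result = decode_strict(RAW)
escaped = decode_escaped(RAW)

# Encoding with the same handler is expected to restore the original bytes.
round_trip = escaped.encode('utf-8', 'surrogateescape')

# ascii() keeps the lone surrogates printable on any terminal encoding.
print(ascii(strict_result))
print(ascii(escaped))
print(round_trip == RAW)
```

Under those assumptions the three bytes map to three separate escape code points rather than to the single '\udcff', which is what keeps the round trip back to bytes unambiguous.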