
Showing content from https://mail.python.org/pipermail/python-dev/2009-April/089151.html below:

Non-decodable Bytes in System Character Interfaces

[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Zooko O'Whielacronx zooko at zooko.com
Tue Apr 28 20:51:43 CEST 2009
On Apr 28, 2009, at 6:46 AM, Hrvoje Niksic wrote:

> Are you proposing to unconditionally encode file names as
> iso8859-15, or to do so only when undecodable bytes are encountered?

For what it is worth, our plan for the Tahoe project has been the
second of these -- decode using some 1-byte encoding such as
iso-8859-1, iso-8859-15, or windows-1252 only when attempting to
decode the bytes using the local alleged encoding has failed.
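
A minimal sketch of that fallback strategy, written for Python 3. This is not Tahoe's actual code: the function name decode_filename, its parameters, and the choice of utf-8 and iso-8859-1 as defaults are assumptions made for illustration. iso-8859-1 is used as the default fallback because it maps every byte value to a code point, so the second decode cannot fail.

```python
def decode_filename(raw: bytes,
                    alleged_encoding: str = "utf-8",
                    fallback_encoding: str = "iso-8859-1") -> str:
    """Decode raw with the alleged encoding; fall back only on failure.

    Hypothetical helper for illustration, not part of any library.
    """
    try:
        # First choice: trust what the local system claims the encoding is.
        return raw.decode(alleged_encoding)
    except UnicodeDecodeError:
        # The bytes are not valid in the alleged encoding, so reinterpret
        # them with a 1-byte encoding instead of raising.
        return raw.decode(fallback_encoding)


# Valid UTF-8 is decoded as UTF-8.
print(ascii(decode_filename(b"caf\xc3\xa9")))  # 'caf\xe9'
# A lone 0xff byte is not valid UTF-8, so the fallback is used.
print(ascii(decode_filename(b"\xff")))         # '\xff'
```

The fallback is only reached when the first decode raises UnicodeDecodeError; an unknown encoding name raises LookupError and is deliberately not caught here.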

> If you switch to iso8859-15 only in the presence of undecodable  
> UTF-8, then you have the same round-trip problem as the PEP: both  
> b'\xff' and b'\xc3\xbf' will be converted to u'\u00ff' without a  
> way to unambiguously recover the original file name.

Why do you say that?  It seems to work as I expected here:

 >>> '\xff'.decode('iso-8859-15')
u'\xff'
 >>> '\xc3\xbf'.decode('iso-8859-15')
u'\xc3\xbf'
 >>>
 >>>
 >>>
 >>> '\xff'.decode('cp1252')
u'\xff'
 >>> '\xc3\xbf'.decode('cp1252')
u'\xc3\xbf'
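
The session above is Python 2, where a plain string literal is a byte string. A rough Python 3 equivalent, using bytes literals, might look like the sketch below. The variable names are illustrative only, and the final UTF-8 lines are an addition for comparison with the quoted question, not part of the original session.

```python
raw_one = b"\xff"
raw_two = b"\xc3\xbf"

# Decode both byte strings with each 1-byte encoding from the session.
for encoding in ("iso-8859-15", "cp1252"):
    print(encoding,
          ascii(raw_one.decode(encoding)),
          ascii(raw_two.decode(encoding)))

# For comparison: how the same bytes behave when the alleged encoding is UTF-8.
print("utf-8", ascii(raw_two.decode("utf-8")))
try:
    raw_one.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 cannot decode", raw_one, "->", exc.reason)
```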

Regards,

Zooko