[Python-Dev] Python-3.0, unicode, and os.environ

Adam Olsen rhamph at gmail.com
Sun Dec 7 18:35:53 CET 2008
On Sun, Dec 7, 2008 at 2:35 AM, Hagen Fürstenau <hfuerstenau at gmx.net> wrote:
>>> As far as I can see all Python Unicode strings can be encoded to UTF-8,
>>> even things like lone surrogates because Python doesn't care about them.
>>> So both the Unicode API and the binary API would be fail-safe on Windows.
>>
>> Python is broken and needs to be fixed.
>>
>> http://bugs.python.org/issue3672
>> http://bugs.python.org/issue3297
>
> But the question of whether Python should care about lone surrogates or
> not is at best tangential to the issue at hand.  If you have lone
> surrogates in the Unicode API (and didn't raise an exception on the way
> getting there), then the sensible thing is to encode them into lone
> UTF-8 surrogates.  Even if you wanted to prevent lone surrogates,
> encoding to UTF-8 for the binary API would not be the place to enforce it.

No.  Unicode *requires* them to be treated as errors.  If you want to
pass them through then you're creating a custom encoding... which you
might argue for in this case, but it needs to be clearly separate from
the real UTF-8.
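
To make the distinction concrete, here is a minimal sketch of how it looks in current CPython 3 releases, which behave differently from the 3.0 under discussion in this thread. It assumes the strict "utf-8" codec and the separately named "surrogatepass" error handler; the helper names encode_strict and encode_passthrough are made up for illustration and are not part of any API.

```python
# Illustrative sketch, assuming a current CPython 3 release.
# encode_strict / encode_passthrough are hypothetical helper names.

LONE_SURROGATE = "\ud800"  # a high surrogate with no low surrogate after it


def encode_strict(text):
    """Encode with the standard UTF-8 codec; lone surrogates are errors."""
    return text.encode("utf-8")


def encode_passthrough(text):
    """Encode with the explicitly named 'surrogatepass' handler.

    The result is not well-formed UTF-8; it is a separate, opt-in variant.
    """
    return text.encode("utf-8", "surrogatepass")


# The strict codec refuses the lone surrogate.
try:
    encode_strict(LONE_SURROGATE)
    strict_rejected = False
except UnicodeEncodeError:
    strict_rejected = True

# The opt-in variant lets it through as a three-byte sequence.
raw = encode_passthrough(LONE_SURROGATE)

# A strict UTF-8 decoder rejects those bytes again.
try:
    raw.decode("utf-8")
    strict_decode_rejected = False
except UnicodeDecodeError:
    strict_decode_rejected = True

# Only the matching opt-in handler round-trips the value.
round_tripped = raw.decode("utf-8", "surrogatepass")
```

Passing surrogates through is something the caller has to ask for by name, so the default codec stays the real UTF-8 and the custom variant is kept clearly separate.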


-- 
Adam Olsen, aka Rhamphoryncus
