Showing content from https://mail.python.org/pipermail/python-dev/2008-December/084138.html below:

[Python-Dev] Python-3.0, unicode, and os.environ

M.-A. Lemburg mal at egenix.com
Mon Dec 8 22:44:30 CET 2008
On 2008-12-08 22:32, Adam Olsen wrote:
> On Mon, Dec 8, 2008 at 2:01 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> On 2008-12-08 21:45, Antoine Pitrou wrote:
>>> M.-A. Lemburg <mal <at> egenix.com> writes:
>>>> Such application specific error handlers could then also apply
>>>> whatever fancy round-trip safe encoding of non-decodable bytes
>>>> to Unicode escapes, private code points, etc. as seen fit by the
>>>> application.
>>> I'd argue that such fancy round-trip safe error handler should be provided by
>>> Python. It's not reasonable to expect application coders to come up with their
>>> own codec variation based on subtle details of the unicode spec.
>> Fair enough. We could add some e.g.
>>
>>  * a round-trip safe escape error handler that uses a Unicode private
>>   code point area which we officially reserve for the Python
>>   interpreter
> 
> This would of course alter the behaviour of those private code points,
> preventing them from round-tripping properly.
> 
> I don't think round-tripping can be done from an error handler.  You
> need a full codec to do it.  A simple option is 8859-1.  Or, ya know,
> bytes.  This has long since gotten repetitive..

The error handler would just map the problem bytes to the private
area. The application would then have to decide what to do with
them, i.e. the error handler only provides one half of the
round-tripping.

And that's on purpose: I don't believe we can come up with some magic
solution for the encodings problem. This is essentially something
that applications will have to solve on a case-by-case basis.
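
A minimal sketch of the private-area idea, using the existing
codecs.register_error() hook. The handler name "pua_escape", the base
code point U+E000 and the helper pua_unescape() are assumptions made
for illustration only; nothing here is an agreed or reserved Python
convention. The handler supplies the decoding half, and pua_unescape()
stands in for the half the application would have to write itself.

```python
import codecs

# Assumption: use the start of the BMP Private Use Area as the escape range.
PUA_BASE = 0xE000


def pua_escape(exc):
    """Decode error handler: map each undecodable byte to U+E000 + byte."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Return the replacement text and the position at which to resume decoding.
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


codecs.register_error("pua_escape", pua_escape)


def pua_unescape(text, encoding="utf-8"):
    """Application-side half: turn escaped code points back into raw bytes."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if PUA_BASE <= cp <= PUA_BASE + 0xFF:
            out.append(cp - PUA_BASE)
        else:
            out += ch.encode(encoding)
    return bytes(out)


raw = b"Andr\xe9"                        # Latin-1 bytes, not valid UTF-8
text = raw.decode("utf-8", "pua_escape")
restored = pua_unescape(text)
```

Note that this runs into the objection quoted above: input that
legitimately contains U+E000..U+E0FF can no longer be told apart from
escaped bytes, so those code points themselves do not round-trip.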

>>  * a human readable escape error handler that encodes the problem
>>   bytes to say hex escapes, e.g. gives Andr\xe9 for a Latin-1
>>   encoded directory name instead of failing
> 
> Similar to 'ö'.encode('ascii', 'backslashreplace')?  I'm +1 on making that work.

Yes.

>>  * a warning error handler that replaces the problem cases with
>>   a question mark and issues a warning through the warning
>>   framework
> 
> I dub thee errors='warnreplace'.

Yep, something along those lines.
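
A rough sketch of what such a handler might look like, again built on
codecs.register_error(). The name 'warnreplace' is taken from the
suggestion above; it is not an error handler that Python ships, and the
choice of UnicodeWarning as the category is an assumption.

```python
import codecs
import warnings


def warnreplace(exc):
    """Replace each problem item with '?' and report it via the warnings framework."""
    if not isinstance(exc, (UnicodeDecodeError, UnicodeEncodeError)):
        raise exc
    problem = exc.object[exc.start:exc.end]
    warnings.warn(
        f"replaced {problem!r} at position {exc.start} ({exc.reason})",
        UnicodeWarning,
        stacklevel=2,
    )
    # One question mark per problem byte or character; resume after the span.
    return "?" * (exc.end - exc.start), exc.end


codecs.register_error("warnreplace", warnreplace)

# Capture the warning so the example is self-contained.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    name = b"Andr\xe9".decode("utf-8", "warnreplace")
```

The replacement is lossy by design, so this variant is suited to
display and logging rather than to passing the name back to the OS.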

Perhaps there are more and better alternatives. These suggestions
are just to show how the idea could be put to some real-life use.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Dec 08 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2008-12-02: Released mxODBC.Connect 1.0.0      http://python.egenix.com/

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/