Showing content from https://mail.python.org/pipermail/python-dev/2010-June/100867.html below:

[Python-Dev] bytes / unicode

Terry Reedy tjreedy at udel.edu
Tue Jun 22 01:55:59 CEST 2010
On 6/21/2010 1:29 PM, Guido van Rossum wrote:

> Actually, the big problem with Python 2 is that if you mix str and
> unicode, things work or crash depending on whether any of the str
> objects involved contain non-ASCII bytes.
>
> If one API decides to upgrade to Unicode, the result, when passed to
> another API, may well cause a UnicodeError because not all arguments
> have had the same treatment.
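
To make the failure mode concrete, here is a minimal Python 3 sketch that imitates the Python 2 behavior described above. It assumes the implicit str-to-unicode coercion used the ASCII codec (the Python 2 default); `py2_style_concat` is a hypothetical helper written for illustration, not a real API.

```python
def py2_style_concat(text, raw):
    """Imitate Python 2's implicit coercion: decode bytes as ASCII, then join."""
    return text + raw.decode("ascii")


# ASCII-only bytes: the mixed operation appears to work.
print(py2_style_concat("name: ", b"Guido"))

# Non-ASCII bytes: the very same call now fails.
try:
    py2_style_concat("name: ", "José".encode("utf-8"))
except UnicodeDecodeError as exc:
    print("fails on non-ASCII data:", exc.reason)
```

Whether the call succeeds depends on the data rather than on the code, which is why such bugs tend to surface late.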
>
>> Now, the APIs are neither safe nor aware -- if you pass bytes in, you get
>> unpredictable results back.
>
> This seems an overgeneralization of a particular bug. There are APIs
> that are strictly text-in, text-out. There are others that are
> bytes-in, bytes-out. Let's call all those *pure*. For some operations
> it makes sense that the API is *polymorphic*, by which I mean that
> text-in causes text-out, and bytes-in causes bytes-out. All of these
> are fine.
>
> Perhaps there are more situations where a polymorphic API would be
> helpful. Such APIs are not always so easy to implement, because they
> have to be careful with literals or other constants (and even more so
> mutable state) used internally -- but it can be done, and there are
> plenty of examples in the stdlib.
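
A polymorphic function might look roughly like the sketch below. `join_path` is a hypothetical name chosen for illustration; the point is only that the internal separator literal has to follow the argument type, which is the "careful with literals" part. `os.path.join` is one stdlib function with the same str-in/str-out, bytes-in/bytes-out behavior.

```python
def join_path(base, name):
    """Join two path segments; str in gives str out, bytes in gives bytes out."""
    if not isinstance(base, (str, bytes)) or type(base) is not type(name):
        raise TypeError("base and name must both be str or both be bytes")
    # The separator literal must match the argument type.
    sep = b"/" if isinstance(base, bytes) else "/"
    return base.rstrip(sep) + sep + name.lstrip(sep)


print(join_path("usr/", "lib"))    # text in, text out
print(join_path(b"usr/", b"lib"))  # bytes in, bytes out
```

Mixing the two types is rejected up front instead of being silently coerced.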
>
> The real problem apparently lies in (what I believe is only a few
> rare) APIs that are text-or-bytes-in and always-text-out (or
> always-bytes-out). Let's call them *hybrid*. Clearly, mixing hybrid
> APIs in a stream of pure or polymorphic API calls is a problem,
> because they turn a pure or polymorphic overall operation into a
> hybrid one.
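
The contamination effect can be sketched as follows. All three function names are hypothetical, and the hybrid step assumes UTF-8 when it is handed bytes, which is exactly the kind of guess a hybrid API has to make.

```python
def strip_polymorphic(value):
    """Polymorphic: returns the same type it was given."""
    return value.strip()


def normalize_hybrid(value):
    """Hybrid: accepts str or bytes but always returns str (assumes UTF-8)."""
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    return value.lower()


def pipeline(value):
    """Two polymorphic steps with one hybrid step in the middle."""
    return strip_polymorphic(normalize_hybrid(strip_polymorphic(value)))


print(type(pipeline(" Spam ")).__name__)   # str
print(type(pipeline(b" Spam ")).__name__)  # str, not bytes
```

A single hybrid call is enough to make the whole pipeline hybrid: the caller who passed bytes gets text back.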
>
> There are also text-in, bytes-out or bytes-in, text-out APIs that are
> intended for encoding/decoding of course, but these are in a totally
> different class.
>
> Abstractly, it would be good if there were as few hybrid APIs as
> possible, many pure or polymorphic APIs (which of the two an API
> should be in a particular case is a pragmatic choice), and a limited number of
> encoding/decoding APIs, which should generally be invoked at the edges
> of the program (e.g., I/O).
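
One way to picture "encoding/decoding at the edges" is the sketch below. It uses in-memory `io.BytesIO` objects as stand-ins for real I/O, and `shout` and `run` are hypothetical names; the UTF-8 default is an assumption of the example.

```python
import io


def shout(text):
    """Pure text-in, text-out step in the middle of the program."""
    return text.upper()


def run(source, sink, encoding="utf-8"):
    text = source.read().decode(encoding)  # decode once, at the input edge
    result = shout(text)                   # everything in between is text
    sink.write(result.encode(encoding))    # encode once, at the output edge


source = io.BytesIO("café\n".encode("utf-8"))
sink = io.BytesIO()
run(source, sink)
print(sink.getvalue().decode("utf-8"))
```

Only `run` knows about the encoding; the inner step never sees bytes.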

Nice summary of part of the 'why' for Python 3.

> I still believe that the instances of bytes silently
> succeeding *some* of the time refers to specific bugs in specific
> APIs, either intentional because of misguided compatibility desires,
> or accidental in the haste of trying to convert the entire stdlib to
> Python 3 in a finite time.

I think http://bugs.python.org/issue5468 reports one aspect of haste:
missing encoding and errors parameters. But it has not gotten much
attention.
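
As a generic illustration of what `encoding` and `errors` parameters buy a caller, here is a small sketch. `read_field` is a hypothetical function, not the API discussed in the issue; it simply forwards both parameters to `bytes.decode`.

```python
def read_field(raw, encoding="utf-8", errors="strict"):
    """Decode raw bytes, letting the caller choose the codec and error policy."""
    return raw.decode(encoding, errors)


data = "naïve".encode("latin-1")

# The caller knows the data is Latin-1 and says so.
print(read_field(data, encoding="latin-1"))

# The caller accepts lossy decoding instead of an exception.
print(read_field(data, errors="replace"))
```

Without these two parameters the function would have to hard-code one codec and one error policy for every caller.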

-- 
Terry Jan Reedy
