Showing content from https://mail.python.org/pipermail/python-dev/2000-September/009470.html below:

[Python-Dev] Disabling Unicode readbuffer interface

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Thu, 21 Sep 2000 18:19:53 +0200

> Martin, haven't you read my last post to Guido ? 

I've read

http://www.python.org/pipermail/python-dev/2000-September/016162.html

where you express a preference for disabling the getreadbuf slot, in
addition to special-casing Unicode objects in s#. I've just tested the
effects of your solution 1 on the test suite. Or are you referring to
a different message?

> Completely disabling getreadbuf is not a solution worth considering --
> it breaks far too much code which the test suite doesn't even test,
> e.g. MarkH's win32 stuff produces tons of Unicode object which
> then can get passed to potentially all of the stdlib. The test suite
> doesn't check these cases.

Do you have any specific examples of what else would break? Looking at
all occurrences of 's#' in the standard library, I can't find a single
case where the current behaviour would be right - in all cases raising
an exception would be better. Again, any counter-examples?

>     Special case Unicode in getargs.c's code for "s#" only and leave
>     getreadbuf enabled. "s#" could then return the default encoded
>     value for the Unicode object while SRE et al. could still use 
>     PyObject_AsReadBuffer() to get at the raw data.

I think your option 2 is acceptable, although I feel that option 1
would expose more potential problems. What if an application
unknowingly passes a Unicode object to md5.update? In testing, it may
always succeed because only ASCII data is used, and it will suddenly
start breaking when some user enters non-ASCII strings.
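
A minimal sketch of that failure mode, written in present-day Python,
where the conversion that "s#" would do implicitly has to be spelled
out. The helper name update_with_default_encoding is hypothetical, and
ASCII is assumed to be the default encoding:

```python
import hashlib

def update_with_default_encoding(hasher, text, encoding="ascii"):
    """Hypothetical stand-in for "s#" under option 2: encode the
    Unicode object with the default encoding, then hash the bytes."""
    hasher.update(text.encode(encoding))

h = hashlib.md5()

# ASCII-only data, as typically used in testing: succeeds.
update_with_default_encoding(h, "hello")

# Non-ASCII data entered by a user: the encode step raises.
try:
    update_with_default_encoding(h, "h\u00e9llo")
    failed = False
except UnicodeEncodeError:
    failed = True
```

The second call fails before anything reaches the hash object, so the
problem only shows up once non-ASCII input appears.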

Using the internal rep would also be wrong in this case - the md5 hash
would depend on the byte order, which is probably not desired (*).
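
The byte-order dependence can be illustrated with a short sketch in
present-day Python. It assumes the internal representation is a
sequence of 2-byte code units in native byte order, and models the two
possible layouts with the utf-16-le and utf-16-be codecs:

```python
import hashlib

text = "abc"

# The same three characters laid out as 2-byte units in each byte order.
little = text.encode("utf-16-le")   # b"a\x00b\x00c\x00"
big = text.encode("utf-16-be")      # b"\x00a\x00b\x00c"

digest_le = hashlib.md5(little).hexdigest()
digest_be = hashlib.md5(big).hexdigest()

# Different input bytes, so the two machines report different hashes.
digests_differ = digest_le != digest_be
```

Hashing the raw buffer would therefore give one digest on a
little-endian machine and another on a big-endian one for the same
string.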

In any case, your option 2 would be a big improvement over the current
state, so I'll just shut up.

Regards,
Martin

(*) BTW, is there a meaningful way to define md5 for a Unicode string?
