Showing content from https://mail.python.org/pipermail/python-dev/2000-July/006615.html below:

[Python-Dev] decoding errors when comparing strings

M.-A. Lemburg mal@lemburg.com
Sat, 15 Jul 2000 19:15:05 +0200

Fredrik Lundh wrote:
> 
> with the locale aware encoding mechanisms switched off
> (sys.getdefaultencoding() == "ascii"), I stumbled upon some
> interesting behaviour:
> 
> first something that makes sense:
> 
>     >>> u"abc" == "abc"
>     1
> 
>     >>> u"åäö" == "abc"
>     0
> 
> but how about this one:
> 
>     >>> u"abc" == "åäö"
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII decoding error: ordinal not in range(128)
> 
> or this one:
> 
>     >>> u"åäö" == "åäö"
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII decoding error: ordinal not in range(128)
> 
> ignoring implementation details for a moment, is this really the
> best we can do?

This is merely due to the fact that on your Latin-1 platform,
"ä" and u"ä" map to the same ordinals. The unicode-escape
codec (which is used by the Python parser) takes single
characters in the whole 8-bit range as Unicode ordinals, so
u"ä" really maps to unichr(ord("ä")).
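The ordinal coincidence can be sketched in present-day Python 3, which is an assumption of this example; the thread itself concerns the 2.0-era interpreter, where `unichr()` and implicit ASCII coercion existed. The byte values are written as escapes so the sketch does not depend on the source file's encoding.

```python
# Python 3 sketch of the Latin-1 coincidence described above.
# The bytes 0xE5 0xE4 0xF6 are what a Latin-1 editor stores for "åäö".
raw = b"\xe5\xe4\xf6"

# Decoding as Latin-1 maps every byte value to the code point with the
# same number, which is the unichr(ord(...)) relationship in the text.
decoded = raw.decode("latin-1")
same_ordinals = [ord(ch) for ch in decoded] == list(raw)

# The ASCII codec rejects the very same bytes, which is where the
# "ordinal not in range(128)" error in the quoted session comes from.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc
```

On a platform with a different 8-bit encoding the byte values would no longer line up with the Unicode ordinals, so the match is a property of Latin-1 rather than of the comparison itself.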

The alternative would be to force the use of escapes for non-ASCII
characters in Unicode literals and to issue an error for any unescaped
non-ASCII character.

BTW, I have a feeling that we should mask the decoding errors
during compares in favour of returning 0... 

...otherwise the current dictionary would bomb (it doesn't do any
compare error checking!) in case a Unicode string happens to have
the same hash value as an 8-bit string key. (Can't test this right now,
but this is what should happen according to the C sources.)
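The dictionary concern can be illustrated with a small Python 3 model. Everything here is hypothetical: `EightBit`, `MaskedEightBit`, and `Text` are stand-ins invented for this sketch, the fixed hash of 42 forces the collision the message describes, and a Python 3 dict propagates the comparison error rather than ignoring it as the 2000-era C code is said to. It is not the interpreter's actual implementation.

```python
class EightBit:
    """Hypothetical stand-in for an 8-bit string used as a dict key."""

    def __init__(self, raw):
        self.raw = raw

    def __hash__(self):
        return 42  # fixed value to force a hash collision (illustration only)

    def __eq__(self, other):
        if isinstance(other, str):
            # Mimics implicit ASCII coercion; raises on non-ASCII bytes.
            return self.raw.decode("ascii") == str(other)
        return NotImplemented


class MaskedEightBit(EightBit):
    """Same key type, but decoding errors compare as 'not equal'."""

    __hash__ = EightBit.__hash__  # defining __eq__ would otherwise unset it

    def __eq__(self, other):
        try:
            return super().__eq__(other)
        except UnicodeError:
            return False


class Text(str):
    """Hypothetical Unicode string that collides with the keys above."""

    def __hash__(self):
        return 42


probe = Text("abc")

# Unmasked: the colliding lookup triggers the comparison, which raises.
plain = {EightBit(b"\xe5\xe4\xf6"): "value"}
try:
    probe in plain
    lookup_raised = False
except UnicodeDecodeError:
    lookup_raised = True

# Masked: the same lookup simply reports that the key is absent.
masked = {MaskedEightBit(b"\xe5\xe4\xf6"): "value"}
found_in_masked = probe in masked

# Pure-ASCII keys still match under the masked comparison.
ascii_hit = {MaskedEightBit(b"abc"): "value"}[probe]
```

Masking trades a visible error for a silent "not equal", which is the design question the thread is debating.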

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
