Showing content from https://mail.python.org/pipermail/python-dev/2000-May/003855.html below:

[I18n-sig] Re: [Python-Dev] Unicode debate

Neil Hodgson nhodgson@bigpond.net.au
Tue, 2 May 2000 21:40:44 +1000

>    u = aUnicodeStringFromSomewhere
>    s = an8bitStringFromSomewhere
>
>    DoSomething(s + u)

> in Guido's design, the first example may or may not result in
> an "UTF-8 decoding error: unexpected code byte" exception.

   I would say it is less surprising for most people if this silently widens
each byte, which is the Fredrik-Paul position. With the current scarcity of
UTF-8 code, very few people will expect an automatic UTF-8 to UTF-16
conversion. While complete prohibition of automatic conversion has some
appeal, it will just be more noise to many.
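
   To make the two positions concrete, here is a minimal sketch in modern
Python 3, which allows no implicit mixing at all, so both conversions have to
be spelled out. The helper names widen and as_utf8 are hypothetical, and the
sample bytes are only an assumption about what an 8-bit string might hold.

```python
def widen(raw: bytes) -> str:
    # "Silent widening": each byte becomes the code point with the same
    # value, which is exactly what the Latin-1 codec does.
    return raw.decode("latin-1")


def as_utf8(raw: bytes) -> str:
    # The other position: treat the 8-bit string as UTF-8 and decode it.
    return raw.decode("utf-8")


s = b"caf\xe9"   # "cafe" with an e-acute as a single Latin-1 byte
u = "\u20ac"     # a Unicode string holding the euro sign

# Widening always succeeds, whatever the bytes are.
widened = widen(s) + u

# UTF-8 decoding fails here because 0xE9 starts a multi-byte sequence
# that is never completed.
try:
    decoded = as_utf8(s) + u
except UnicodeDecodeError:
    decoded = None
```

   With widening, s + u can never raise; with UTF-8 decoding, whether it
raises depends entirely on the bytes that happen to be in s.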

>    u = aUnicodeStringFromSomewhere
>    s = an8bitStringFromSomewhere
>
>    if len(u) + len(s) == len(u + s):
>        print "true"
>    else:
>        print "not true"

> the second example may result in a
> similar error, print "true", or print "not true", depending on the
> contents of the 8-bit string.

   I don't see this as important, as it's taking the idea that Unicode strings
are equivalent to 8 bit strings too far. How much further can you go before it
has to break? I always thought of len as measuring the number of bytes rather
than characters when applied to strings, the same as strlen in C when you have
a DBCS string.
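
   A rough illustration of both points, again as a Python 3 sketch rather
than the behaviour of the interpreter under discussion. The function check
is a hypothetical stand-in for the quoted test, assuming the 8-bit string is
decoded as UTF-8 before concatenation; the Shift_JIS sample is likewise just
an assumed example of DBCS text.

```python
def check(u: str, s: bytes) -> str:
    # Mimic the quoted test, decoding the 8-bit string as UTF-8 first.
    try:
        combined = u + s.decode("utf-8")
    except UnicodeDecodeError:
        return "error"
    return "true" if len(u) + len(s) == len(combined) else "not true"


u = "abc"
results = [
    check(u, b"xyz"),          # pure ASCII: byte count equals character count
    check(u, b"caf\xc3\xa9"),  # valid UTF-8: 5 bytes but only 4 characters
    check(u, b"caf\xe9"),      # a lone Latin-1 byte is not valid UTF-8
]

# len on a DBCS byte string counts bytes, like strlen in C.
text = "\u65e5\u672c\u8a9e"        # three kanji characters
dbcs = text.encode("shift_jis")    # each one takes two bytes in Shift_JIS
byte_len = len(dbcs)
char_len = len(dbcs.decode("shift_jis"))
```

   The three calls give "true", "not true" and "error" in turn, and the DBCS
string reports 6 bytes for 3 characters, which is the strlen view of len.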

   I should correct some of the stuff Mark wrote about me. At Fujitsu we did
a lot more DBCS work than Unicode because that's what Japanese code uses.
Even with Java most storage is still DBCS. I was more involved with Unicode
architecture at Reuters 6 or so years ago.

   Neil
