Showing content from http://mail.python.org/pipermail/python-dev/2000-May/003865.html below:

[Python-Dev] Re: [I18n-sig] Unicode debate

Just van Rossum just@letterror.com
Tue, 2 May 2000 14:39:24 +0100
At 1:42 AM -0700 02-05-2000, Ka-Ping Yee wrote:
>If it turns out automatic conversions *are* absolutely necessary,
>then i vote in favour of the simple, direct method promoted by Paul
>and Fredrik: just copy the numerical values of the bytes.  The fact
>that this happens to correspond to Latin-1 is not really the point;
>the main reason is that it satisfies the Principle of Least Surprise.

Exactly.

I'm not sure if automatic conversions are absolutely necessary, but seeing
8-bit strings as Latin-1 encoded Unicode strings seems most natural to me.
Heck, even 8-bit strings should have an s.encode() method, that would
behave *just* like u.encode(), and unicode(blah) could even *return* an
8-bit string if it turns out the string has no character codes > 255!
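To make the "copy the numerical values of the bytes" idea concrete, here is a minimal sketch in present-day Python 3 (not the interpreter under discussion in this thread). The helper names widen and narrow_if_possible are hypothetical, made up for illustration; the second one only approximates the unicode(blah) behaviour suggested above.

```python
def widen(data: bytes) -> str:
    """Copy each byte value straight to a code point (hypothetical helper)."""
    return "".join(chr(b) for b in data)


def narrow_if_possible(text: str):
    """Return bytes when every code point is <= 255, otherwise the str itself.

    Hypothetical helper that approximates the suggestion that a conversion
    could hand back an 8-bit string when no wide characters are present.
    """
    if all(ord(ch) <= 255 for ch in text):
        return bytes(ord(ch) for ch in text)
    return text


raw = bytes(range(256))

# Copying byte values gives the same result as decoding as Latin-1.
assert widen(raw) == raw.decode("latin-1")

# The round trip is lossless for every possible byte value.
assert narrow_if_possible(widen(raw)) == raw
```

Because Latin-1 maps byte value n to code point n, the plain copy and the Latin-1 decode cannot disagree, which is why the two descriptions of the proposal amount to the same behaviour.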

Conceptually, this gets *very* close to the ideal of "there is only one
string type", and at the same time leaves room for 8-bit strings doubling
as byte arrays for backward compatibility reasons.

(Unicode strings and 8-bit strings could even be the same type, which only
uses wide chars when necessary!)
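A rough sketch of what "only uses wide chars when necessary" could look like, again in present-day Python 3. The names storage_width and pack are hypothetical, and the 1/2/4-byte tiers are an assumption chosen for illustration, not something specified in this message.

```python
def storage_width(text: str) -> int:
    """Bytes per character needed for a fixed-width buffer (hypothetical helper)."""
    highest = max(map(ord, text), default=0)
    if highest <= 0xFF:
        return 1  # fits in an 8-bit string
    if highest <= 0xFFFF:
        return 2
    return 4


def pack(text: str):
    """Store text using the narrowest width that holds every character."""
    width = storage_width(text)
    payload = b"".join(ord(ch).to_bytes(width, "little") for ch in text)
    return width, payload


# A string with no character codes above 255 stays one byte per character.
assert pack("hi") == (1, b"hi")

# A single wider character forces the whole string to the next width.
assert pack("\u20ac") == (2, b"\xac\x20")
```

The trade-off in this sketch is that one wide character widens the whole buffer, in exchange for keeping constant-time indexing.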

Just




