> It's the naive user who will be surprised by these random UTF-8 decoding
> errors.
>
> That's why this is NOT a convenience issue (are you listening MAL???).
> It's a short and long term simplicity issue. There are lots of languages
> where it is de rigueur to discover and work around inconvenient and
> confusing default behaviors. I just don't think that we should be ADDING
> such behaviors.

So what do you think of my new proposal of using ASCII as the default
"encoding"? It takes care of "a character is a character" but also
(almost) guarantees an error message when mixing encoded 8-bit strings
with Unicode strings without specifying an explicit conversion -- *any*
8-bit byte with the top bit set is rejected by the default conversion to
Unicode.

I think this is less confusing than Latin-1: when an unsuspecting user
reads encoded text from a file into 8-bit strings and attempts to use it
in a Unicode context, an error is raised instead of producing garbage
Unicode characters.

It encourages the use of Unicode strings for everything beyond ASCII --
there's no way around ASCII since that's the source encoding etc., but
Latin-1 is an inconvenient default in most parts of the world. ASCII is
accepted everywhere as the base character set (e.g. for email and for
text-based protocols like FTP and HTTP), just like English is the one
natural language that we can all use to communicate (to some extent).

--Guido van Rossum (home page: http://www.python.org/~guido/)
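
To make the effect concrete, here is a minimal sketch of the behavior the
proposal implies, in Python 2-era syntax (the exact exception class is an
assumption; the point is that implicit coercion through the ASCII default
rejects any byte with the top bit set):

    # Sketch of the proposed ASCII default encoding (Python 2-era syntax).
    ascii_bytes = 'hello'
    latin1_bytes = 'caf\xe9'          # 0xe9 has the top bit set

    print ascii_bytes + u' world'     # pure ASCII coerces to Unicode silently

    try:
        print latin1_bytes + u'!'     # implicit coercion uses the ASCII default
    except UnicodeError, exc:         # assumed exception class for the rejection
        print 'rejected:', exc

    # An explicit conversion names the real encoding and succeeds.
    print unicode(latin1_bytes, 'latin-1') + u'!'

The failure is loud and immediate, which is the property argued for above:
the user is forced to name the real encoding rather than silently getting
garbage Unicode characters.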