Paul Prescod wrote:
>
> Combining characters are a whole 'nother level of complexity. Character
> sets are hard. I don't accept the argument that "Unicode itself has
> complexities so that gives us license to introduce even more
> complexities at the character representation level."
>
> > FYI: Normalization is needed to make comparing Unicode
> > strings robust, e.g. u"é" should compare equal to u"e\u0301".
>
> That's a whole 'nother debate at a whole 'nother level of abstraction. I
> think we need to get the bytes/characters level right and then we can
> worry about display-equivalent characters (or leave that to the Python
> programmer to figure out...).

I just wanted to point out that the argument "slicing doesn't work
with UTF-8" is moot.

I do see a point against UTF-8 auto-conversion, though, given the example
that Guido mailed me:

"""
s = 'ab\341\210\264def'    # == str(u"ab\u1234def")
s.find(u"def")

This prints 3 -- the wrong result, since "def" is found at
s[5:8], not at s[3:6].
"""

--
Marc-Andre Lemburg

______________________________________________________________________
Business:                                       http://www.lemburg.com/
Python Pages:                            http://www.lemburg.com/python/
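[Editor's note: a minimal sketch, in modern Python rather than the 1.6/2.0
vintage discussed in this thread, of the offset mismatch behind Guido's
example. Character indices into the Unicode string and byte indices into
its UTF-8 encoding diverge as soon as a multi-byte character appears,
which is why an auto-converting find() returns an index that is useless
against the original byte string.]

u = u"ab\u1234def"          # 6 characters; U+1234 is one character
b = u.encode("utf-8")       # 8 bytes; U+1234 encodes to 3 bytes

print(u.find(u"def"))       # 3 -- character offset in the Unicode string
print(b.find(b"def"))       # 5 -- byte offset in the UTF-8 data

# The character offset slices the Unicode string correctly ...
assert u[3:6] == u"def"
# ... but the same offset applied to the UTF-8 bytes picks out the wrong
# data; the byte string needs s[5:8], exactly as the quoted example says.
assert b[3:6] != b"def"
assert b[5:8] == b"def"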