"Stephen J. Turnbull" wrote:
>
> ...
>
> I don't see any need for a deviation of the implementation from the
> spec. Just slurp in the whole file in the specified encoding.

That's phase 2. It's harder to implement, so it won't be in Python 2.3.
They are trying to get away with changing the *output* of the
lexer/parser rather than the *input*, because the lexer/parser code
probably predates Unicode and certainly predates Guido's thinking about
internationalization issues. We're moving in baby steps.

> ... Then
> cast the Unicode characters in ordinary literal strings down to
> bytesize (my preference, probably with errors on Latin-1 <0.5 wink>) or
> reencode them (Guido's and your suggestion). People who don't like
> the results in their non-Unicode literal strings (probably few) should
> use hex escapes. Sure, you'll have to rewrite the parser in terms of
> UTF-16. But I thought that was where you were going anyway.

Sure, but a partial implementation now is better than a perfect
implementation at some unspecified time in the future.

> If not, it should be nearly trivial to rewrite the parser in terms of
> UTF-8 (since it is a superset of ASCII and non-ASCII is currently only
> allowed in comments or guarded by a (Unicode)? string literal AFAIK).
> The main issue would be anything that involves counting characters
> (not bytes!), I think.

That would be an issue. Plus it would be the first place that the Python
interpreter used UTF-8 as an internal representation. So it would also
be a half-step, and it might involve more redoing later.

Paul Prescod
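[Editor's note: the characters-versus-bytes concern above, and the phase-2 idea of decoding the whole file in its declared encoding before tokenizing, can be sketched as follows. This is a minimal illustration in modern Python, not the actual tokenizer code; the `read_source` helper and the sample literal are hypothetical.]

```python
# In UTF-8, non-ASCII characters occupy multiple bytes, so byte
# offsets and character (code point) offsets diverge -- the problem
# a UTF-8-based parser would have to handle everywhere it counts.

text = "naïve"                  # 5 characters
utf8 = text.encode("utf-8")     # "ï" encodes to two bytes

assert len(text) == 5           # character count
assert len(utf8) == 6           # byte count

# Phase 2 of the proposal, sketched: decode the entire source file
# in the declared encoding up front, then hand Unicode text to the
# tokenizer, so all counting happens in characters.
def read_source(raw: bytes, declared_encoding: str) -> str:
    return raw.decode(declared_encoding)

source = read_source(b"s = 'na\xc3\xafve'\n", "utf-8")
assert "naïve" in source
```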