Showing content from https://mail.python.org/pipermail/python-dev/2002-March/021231.html below:

[Python-Dev] PEP 263 - default encoding

Guido van Rossum guido@python.org
Fri, 15 Mar 2002 14:39:05 -0500
> a. Does this really make sense for UTF-16?  It looks to me like a
> great way to induce bugs of the form "write a unicode literal
> containing 0x0A, then translate it to raw form by stripping the u
> prefix."

Of course not. I don't expect anyone to put UTF-16 in their source
encoding cookie.  But should we bother making a list of encodings that
shouldn't be used?
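
To illustrate the concern in (a), here is a minimal sketch in present-day Python 3 (an assumption, since this thread predates it). It shows only that UTF-16 can put a 0x0A byte inside the encoded form of a character that is not a newline. The variable names are hypothetical.

```python
# U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE) is not a line break,
# but its little-endian UTF-16 form starts with the byte 0x0A.
ch = "\u010a"
encoded = ch.encode("utf-16-le")

# The raw bytes contain what a byte-oriented tool would read as a newline.
has_newline_byte = b"\n" in encoded

# Splitting on that byte cuts the single character in two.
pieces = encoded.split(b"\n")

print(has_newline_byte, pieces)
```

Any tool that scans the file byte by byte for line ends or quote characters would misread such a literal, which is why UTF-16 is a poor fit for a source encoding cookie.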

> b. No editor is likely to implement correct display to distinguish
> between u"" and just "".

That's fine.  Given phase 2, the editor should display the entire file
using the encoding given in the cookie, even though phase 1 applies
the encoding only to u"" literals.  The rest of the file is
supposed to be ASCII, and if it isn't, that's the user's problem.

> c. This definitely breaks Emacs coding cookie semantics.  Emacs
> applies the coding cookie to the whole buffer.  I don't see a way to
> lose offhand, but this is sufficiently subtle that I don't want to
> break my head trying to prove that you can't lose, either.

I wouldn't worry about that, see above.

> d. You probably have to deprecate ISO 2022 7-bit coding systems, too,
> because people will try to get the representation of a string by
> inputting a raw string in coded form.  This might contain a quote
> character.

Good point.  This sounds like a documentation issue at worst.
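
The quote-character problem in (d) can be sketched as follows, assuming present-day Python 3 and its `iso2022_jp` codec (neither is mentioned in the thread). The choice of U+3001 is only an example of a character whose 7-bit coded form happens to include the byte for a double quote.

```python
# U+3001 (IDEOGRAPHIC COMMA) encoded in a 7-bit ISO 2022 coding system.
text = "\u3001"
encoded = text.encode("iso2022_jp")

# Every byte stays in the 7-bit range, so the coded form looks like ASCII.
is_seven_bit = all(byte < 0x80 for byte in encoded)

# One of those bytes is 0x22, the ASCII double quote.
contains_quote_byte = b'"' in encoded

print(is_seven_bit, contains_quote_byte, encoded)
```

A tokenizer that looks for the closing quote of a plain string literal byte by byte could end the literal early at that position.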

> e. This causes problems for UTF-8 transition, since people will want
> to put arbitrary byte strings in a raw string.

I'm not sure I understand.  What do you call a raw string?  Do you
mean an r"" literal?  Why would people want to use that for arbitrary
binary data?  Arbitrary binary data should *always* be encoded using
\xDD hex or \OOO octal escapes.
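
A minimal sketch of the escape forms mentioned above, written for present-day Python 3 bytes literals (an assumption; in 2002 these would have been plain string literals). The names are hypothetical.

```python
import ast

# The same four bytes written with hex escapes and with octal escapes.
data_hex = b"\x00\xff\x80\x0a"
data_octal = b"\000\377\200\012"

# The source text of the literal is pure ASCII, so it survives any
# ASCII-compatible source encoding unchanged.
literal_source = r'b"\x00\xff\x80\x0a"'
literal_source.encode("ascii")  # raises UnicodeEncodeError if not ASCII
round_tripped = ast.literal_eval(literal_source)

# In an r"" literal the escape is not processed: this is four
# characters (backslash, x, f, f), not one byte.
raw_length = len(r"\xff")

print(data_hex == data_octal, round_tripped == data_hex, raw_length)
```

Because escapes are left untouched in r"" literals, a raw string cannot express arbitrary bytes through escapes at all, which supports using ordinary literals with \xDD or \OOO for binary data.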

> But these will not be
> legal UTF-8 files, even though they have a UTF-8 coding cookie.
> People who are trying to do the right thing will have the rules
> changed again later, most likely.

If you're trying to do the right thing you shouldn't be putting
arbitrary binary data in any string literal.

> This means that until editors reliably implement b. and similar
> features, developers must change coding systems to type raw strings
> and Unicode strings.

Sounds like a YAGNI to me.

--Guido van Rossum (home page: http://www.python.org/~guido/)


