> Now that I can edit UTF-8 directly, I find a "feature" made
> possible by the PEP 263 support of Python 2.3 rather
> puzzling:
>
> Let's say I edit a file testencoding.py in XEmacs with UTF-8
> support:

(Note that I'm viewing this as Latin-1.  The comment, s and u in the
source are all three the same: a-umlaut, o-umlaut, u-umlaut.)

> # -*- coding: utf-8; -*-
> # comment äöü
> s = "äöü"
> u = u"äöü"
> print s
> print u.encode('latin-1')
> print 'works !'
>
> With Python 2.3 this prints:
>
> äöü
> äöü
> works !
>
> I would have expected that s turns out as "äöü" using print,
> since that's how I wrote it in the source file.

No, because stdout isn't assumed to be UTF-8.  The string s is your
string encoded in UTF-8, and those are the bytes written by print.

> This suggests to me that mixing string and Unicode literals
> using non-ASCII characters in a single file should probably
> be avoided.

Or it suggests that we need a way to deal with encodings on stdout
more gently.

--Guido van Rossum (home page: http://www.python.org/~guido/)
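
For readers following along, here is a minimal sketch of the point about stdout. It is written for Python 3 rather than the Python 2.3 under discussion, so the byte strings are built explicitly with encode(); the names text, seen_s and seen_u are illustrative and do not come from the thread.

```python
# Python 3 sketch; the thread itself concerns Python 2.3 string literals.

text = "\u00e4\u00f6\u00fc"  # a-umlaut, o-umlaut, u-umlaut

# In Python 2.3 a plain "..." literal in a UTF-8 source file kept the raw
# UTF-8 bytes, which is what this encode() call reproduces.
s = text.encode("utf-8")

# A u"..." literal was decoded to Unicode; encoding it as Latin-1 for
# output gives one byte per character.
u_latin1 = text.encode("latin-1")

# What a Latin-1 terminal would display when handed each byte string.
seen_s = s.decode("latin-1")
seen_u = u_latin1.decode("latin-1")

print(len(s), len(u_latin1))  # byte counts of the two encodings
print(ascii(seen_s))          # six Latin-1 characters, not three umlauts
print(ascii(seen_u))          # the three umlauts, as intended
```

The print of s goes wrong only because the terminal decodes the bytes with a different encoding than the one used to produce them; the bytes themselves are exactly what the source file contained.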