[Guido]
> ...
> While I'm on the topic, I don't see in your proposal a description of
> the source file character encoding. Currently, this is undefined, and
> in fact can be (ab)used to enter non-ASCII in string literals.
> ...
> What should we do about this? The safest and most radical solution is
> to disallow non-ASCII source characters; François will then have to
> type
>
>     print u"Written by Fran\u00E7ois."
>
> but, knowing François, he probably won't like this solution very much
> (since he didn't like the \347 version either).

So long as Python opens source files using libc text mode, it can't
guarantee more than C does: the presence of any character other than tab,
newline, and ASCII 32-126 inclusive renders the file contents undefined.

Go beyond that, and you've got the same problem as mailers and browsers,
and so also the same solution: open source files in binary mode, and add a
pragma specifying the intended charset.

As a practical matter, declare that Python source is Latin-1 for now, and
declare any *system* that doesn't support that non-conforming <wink>.

python-is-the-measure-of-all-things-ly y'rs - tim
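
The binary-mode-plus-pragma idea can be sketched in a few lines of
present-day Python. This is only an illustration under stated assumptions,
not anything the post specifies: the `# charset: NAME` comment syntax, the
choice to scan just the first two lines, and the names `PRAGMA_RE`,
`decode_source` and `read_source` are all hypothetical. The Latin-1
fallback is the interim default suggested above.

```python
import re

# Hypothetical pragma syntax, assumed for illustration: a comment such as
#   # charset: utf-8
# The post proposes a pragma but does not define what it looks like.
PRAGMA_RE = re.compile(rb"^[ \t]*#.*?charset[:=][ \t]*([-\w.]+)")

# Interim default suggested in the post.
DEFAULT_CHARSET = "latin-1"


def decode_source(raw: bytes) -> str:
    """Decode raw source bytes with the declared charset, else Latin-1."""
    charset = DEFAULT_CHARSET
    # Assumption: the pragma, if present, sits on one of the first two lines.
    for line in raw.splitlines()[:2]:
        match = PRAGMA_RE.match(line)
        if match:
            charset = match.group(1).decode("ascii")
            break
    return raw.decode(charset)


def read_source(path: str) -> str:
    """Read a source file in binary mode, then decode it explicitly."""
    # Binary mode keeps the bytes untouched by any text-mode translation.
    with open(path, "rb") as f:
        return decode_source(f.read())
```

With a scheme like this, the same literal can be stored as UTF-8 when the
file says so, or as a single Latin-1 byte when it says nothing; either way
the decoder, not the platform's text mode, decides what the bytes mean.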