Thinking more about entering Japanese into raw_input() in IDLE, I thought I had figured out a way to give Takeuchi a Unicode string when he enters Japanese characters.

I added an experimental patch to the readline method of the PyShell class: if the line just read, when converted to Unicode, has fewer characters but still compares equal (and no exceptions happen during this test), then return the Unicode version.

This doesn't currently work because the built-in raw_input() function requires that the readline() call it makes internally return an 8-bit string. Should I relax that requirement in general? (I could also just replace __builtin__.[raw_]input with more liberal versions supplied by IDLE.)

I also discovered that the built-in unicode() function is not idempotent: unicode(unicode('a')) returns u'\000a'. I think it should special-case this and return u'a'!

Finally, I believe we need a way to discover the encoding used by stdin or stdout. I have to admit I know very little about the file wrappers that Marc wrote -- is it easy to get the encoding out of them? IDLE should probably emulate this, as its encoding is clearly UTF-8 (at least when using Tcl 8.1 or newer).

--Guido van Rossum (home page: http://www.python.org/~guido/)
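
For readers who want to see the readline heuristic spelled out, here is a minimal sketch of the idea. It is not the actual IDLE patch: it is written for a modern Python 3 interpreter, where 8-bit strings are bytes and Unicode strings are str, and the helper name maybe_decode is hypothetical. Since bytes and str never compare equal in Python 3, the "still compares equal" test is approximated by checking that the decoded text encodes back to the same bytes. UTF-8 is assumed as the input encoding.

```python
def maybe_decode(line, encoding="utf-8"):
    """Return text if `line` (bytes) contains multi-byte characters, else the bytes unchanged.

    Hypothetical helper illustrating the heuristic described above;
    not the code from the IDLE patch.
    """
    try:
        text = line.decode(encoding)
    except UnicodeDecodeError:
        # Any exception during the test means: keep the 8-bit string.
        return line
    # Fewer characters than bytes means at least one multi-byte sequence
    # was present. The round-trip check stands in for "compares equal".
    if len(text) < len(line) and text.encode(encoding) == line:
        return text
    return line


ascii_result = maybe_decode(b"abc\n")
japanese_result = maybe_decode("日本語\n".encode("utf-8"))
invalid_result = maybe_decode(b"\xff\xfe\n")
```

With this sketch, pure ASCII input and input that is not valid UTF-8 come back as the original 8-bit string, and only input containing multi-byte characters comes back as text.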