> > Finally, I believe we need a way to discover the encoding used by
> > stdin or stdout.  I have to admit I know very little about the file
> > wrappers that Marc wrote -- is it easy to get the encoding out of
> > them?
>
> I'm not sure what you mean: the name of the input encoding?
> Currently, only the names of the encoding and decoding functions
> are available to be queried.

Whatever is helpful for a module or program that wants to know what
kind of encoding is used.

> > IDLE should probably emulate this, as its encoding is clearly
> > UTF-8 (at least when using Tcl 8.1 or newer).
>
> It should be possible to redirect sys.stdin/stdout using
> the codecs.EncodedFile wrapper. Some tests show that raw_input()
> doesn't seem to use the redirected sys.stdin though...
>
> >>> sys.stdin = EncodedFile(sys.stdin, 'utf-8', 'latin-1')
> >>> s = raw_input()
> äöü
> >>> s
> '\344\366\374'
> >>> s = sys.stdin.read()
> äöü
> >>> s
> '\303\244\303\266\303\274\012'

This deserves more looking into.  The code for raw_input() in
bltinmodule.c certainly *tries* to use sys.stdin.  (I think that
because your EncodedFile object is not a real stdio file object, it
will take the second branch, near the end of the function; this calls
PyFile_GetLine() which attempts to call readline().)

Aha!  It actually seems that your read() and readline() are
inconsistent!  I don't know your API well enough to know which string
is "correct" (\344\366\374 or \303\244\303\266\303\274) but when I
call sys.stdin.readline() I get the same as raw_input() returns:

>>> from codecs import *
>>> sys.stdin = EncodedFile(sys.stdin, 'utf-8', 'latin-1')
>>> s = raw_input()
äöü
>>> s
'\344\366\374'
>>> s = sys.stdin.read()
äöü
>>>
>>> s
'\303\244\303\266\303\274\012'
>>> unicode(s)
u'\344\366\374\012'
>>> s = sys.stdin.readline()
äöü
>>> s
'\344\366\374\012'
>>>

Didn't you say that your wrapper only wraps read()?  Maybe you need to
revise that decision!
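
The read()/readline() comparison above can be repeated without a terminal. What follows is only a sketch, and it makes several assumptions that are not part of the sessions shown: it targets a current Python 3 interpreter rather than the Python used in this thread, an io.BytesIO object stands in for sys.stdin, and the helper name wrap() is made up for illustration. In the codecs module as it exists today, both read() and readline() on the wrapper recode the data, so the two results should agree, and the wrapper carries data_encoding and file_encoding attributes that a program can query.

```python
import codecs
import io

# Latin-1 bytes for "äöü\n", standing in for input typed at a Latin-1 terminal.
RAW = "äöü\n".encode("latin-1")


def wrap(raw):
    """Hypothetical helper: wrap Latin-1 bytes so that reads yield UTF-8 bytes.

    io.BytesIO is an assumption standing in for the real sys.stdin.
    """
    return codecs.EncodedFile(io.BytesIO(raw), "utf-8", "latin-1")


# Read the same input once through read() and once through readline().
via_read = wrap(RAW).read()
via_readline = wrap(RAW).readline()

print(via_read)
print(via_readline)
print(via_read == via_readline)

# The wrapper records the encodings it was created with.
wrapped = wrap(RAW)
print(wrapped.data_encoding, wrapped.file_encoding)
```

If the two printed byte strings differ on some interpreter, that would be the same inconsistency seen in the sessions above.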

(Note that PyShell doesn't even define read() -- it only defines
readline().)

--Guido van Rossum (home page: http://www.python.org/~guido/)