> > http://www.python.org/peps/pep-0277.html
> >
> > The PEP describes a Windows-only change to Unicode in file names: On
> > Windows NT/2k/XP, Python would allow arbitrary Unicode strings as file
> > names and pass them to the OS, instead of converting them to CP_ACP
> > first. This applies to open() and all os functions that accept
> > filenames.
> >
> > In addition, os.list() would return Unicode filenames if the argument
> > is Unicode.
>
> This is the bit I still don't like (at least, if I'm not
> mistaken I commented on it a while ago too). A routine could be
> doing an os.list() expecting strings, but suddenly someone
> passes it a unicode directoryname and the return value would
> change.

Hm, that would be the responsibility of whoever passes it Unicode.
Most code works just fine when presented with Unicode where 8-bit
strings are expected.  It's only code that assumes the 8-bit strings
are Latin-1 (or something else besides ASCII) that gets in trouble.

But shouldn't it return Unicode whenever there are filenames in the
directory that can't be represented as ASCII?  That's what Tkinter
does: Tk gives back UTF-8, which degenerates to ASCII if there are
only ASCII chars; if any high bits are detected, Tkinter decodes the
UTF-8, turning the return string into Unicode.

> I would much prefer an optional encoding argument whereby you
> give the encoding in which you want the return value. Default
> would be the local filesystem encoding. If you pass unicode you
> will get direct unicode on XP/2K, and a converted string on
> other platforms (but always unicode).

Hm, I don't know if I'd like os.listdir() to have an encoding
argument.  Sounds like the wrong solution somehow.

> Oh yes, the same reasoning would hold for readlink(), getcwd()
> and any other call that returns filenames.

Ditto.

--Guido van Rossum (home page: http://www.python.org/~guido/)
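
As a rough illustration of the Tkinter behaviour described above, here is
a minimal Python 3 sketch. The helper name tk_style_result is hypothetical
and this is not Tkinter's actual code; it only assumes the rule as stated
in the message: pure-ASCII data is passed through as an 8-bit string, and
anything with a high bit set is decoded from UTF-8 into Unicode.

```python
def tk_style_result(raw: bytes):
    """Hypothetical helper mimicking the rule described in the message.

    Pure-ASCII input is returned unchanged as an 8-bit string (bytes);
    input containing any byte with the high bit set is decoded as UTF-8
    and returned as a Unicode string (str).
    """
    # UTF-8 "degenerates to ASCII" when every byte is below 0x80.
    if all(byte < 0x80 for byte in raw):
        return raw
    # A high bit was detected, so decode the UTF-8 into Unicode.
    return raw.decode("utf-8")


# The return type depends on the data, not on what the caller passed in.
plain = tk_style_result(b"readme.txt")
accented = tk_style_result("r\u00e9sum\u00e9.txt".encode("utf-8"))
```

For comparison, present-day Python 3 lets the argument type decide: as far
as I know, os.listdir() returns bytes names for a bytes path and str names
for a str path.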