On Sep 30, 2008, at 5:40 PM, Martin v. Löwis wrote:

>>> On Windows, we might reject bytes filenames for all file
>>> operations: open(),
>>> unlink(), os.path.join(), etc. (raise a TypeError or UnicodeError)
>>
>> Since I've seen no objections to this yet: please no. If we offer a
>> "lower-level" bytes filename API, it should work for all platforms.
>
> Unfortunately, it can't. You cannot represent all possible file names
> in a byte string in Windows (just as you can't do so in a Unicode
> string on Unix).

As you mention in the parenthetical below, of course it can.

> So using byte strings on Windows would work for some files, but fail
> for others. In particular, listdir might give you a list of file names
> which you then can't open/stat/recurse into.
>
> (of course, you could use UTF-8 as the file system encoding on
> Windows,
> but then you will have to rewrite a lot of C code first)

Yes! If there is a byte-string access method for Windows, pretty please
make it decode from UTF-8 internally and call the Unicode version of the
Windows APIs. The non-Unicode Windows APIs are pretty much just broken --
ideally, Python should never be calling those.

But I still don't like the idea of propagating the "sometimes a string,
sometimes bytes" APIs... One or the other, please. Either always strings
(if and only if there is a method for ensuring that decoding always
succeeds), or always bytes.

James
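
A minimal sketch of the "decode from UTF-8 internally" suggestion above, written at the Python level rather than in the C code the thread is actually about. The helper names `to_text_path` and `open_any` are hypothetical and not part of any proposed API. The sketch assumes a byte-string filename is always meant as UTF-8, and relies on the interpreter using the wide-character Windows APIs when it is handed a str path.

```python
def to_text_path(path, encoding="utf-8"):
    """Return `path` as str, decoding bytes as UTF-8 (hypothetical helper)."""
    if isinstance(path, bytes):
        # Strict decoding: invalid UTF-8 raises UnicodeDecodeError instead
        # of silently naming a different file.
        return path.decode(encoding)
    return path


def open_any(path, mode="r"):
    """Open `path` given as either str or UTF-8 bytes (hypothetical helper)."""
    # Only a str ever reaches open(), so on Windows the call is expected to
    # go through the Unicode API rather than the byte-oriented one.
    return open(to_text_path(path), mode)
```

This only covers the bytes-to-str direction. A listdir that returns bytes would need the reverse encoding step, and the Unix case, where a file name need not be valid in any encoding, is not addressed here.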