> I reversed again, posixmodule now detects Unicode arguments and handles
> them in UCS-2 rather than converting to UTF-8 and back again. This now looks
> like the right way to me. The total amount of code bloat is about 8K over a
> 150K file and this doesn't appear to be too much for me.

I agree. We still should keep "mbcs", so extension modules that don't
want to go through the trouble of special-casing Windows will be able
to get it right most of the time.

> A check is made to see if the platform supports Unicode file names and if
> it does not then the old conversion to Py_FileSystemDefaultEncoding is done.
> This means that Windows 9x should work the same as it currently does. This
> check is exposed as os.unicodefilenames() so that client code can decide
> whether to use Unicode.

That has unclear semantics for me. It sounds like "if true, you can
pass Unicode strings to open etc." However, then it should return 1 on
all systems, since you always can - the default encoding may apply,
and restrict file names to ASCII.

Or, it may mean "if true, you can pass all Unicode strings to
open". This is not true, either, because there are always reserved
characters (such as the path delimiter).

> For other OSs that can support Unicode file names, additional cases can
> be added into posixmodule. The other platforms (OS X for example) may not
> provide these functions as taking UCS-2 arguments but instead UTF-8
> arguments. They should still work similarly to the NT code but encode into
> UTF-8 before making system calls.

I think this is not needed. Instead, setting the file system encoding
to UTF-8 should be sufficient.

> After waiting a while for comments, I'll package this up as a patch.

Very good. Would you also write the PEP? If not, I will, but that may
take some time.

Regards,
Martin
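
To make the objection to os.unicodefilenames() concrete, here is a minimal
sketch, not part of the proposed patch. It assumes a present-day Python where
sys.getfilesystemencoding() is available; the helper name
encodable_as_filename is hypothetical. It shows that whether a Unicode name
"can be passed" depends on the file system encoding in effect, not only on
the platform.

```python
import sys

def encodable_as_filename(name, encoding=None):
    """Return True if `name` encodes cleanly in the given encoding.

    Hypothetical helper for illustration only. When `encoding` is None,
    the interpreter's file system encoding is used.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        name.encode(encoding)  # strict error handling: fail on any gap
    except UnicodeEncodeError:
        return False
    return True

# An ASCII-only encoding accepts "abc" but rejects a non-ASCII name,
# while UTF-8 as the file system encoding accepts both.
print(encodable_as_filename("abc", "ascii"))
print(encodable_as_filename("\u00e9", "ascii"))
print(encodable_as_filename("\u00e9", "utf-8"))
```

This check says nothing about reserved characters such as the path
delimiter; those still have to be rejected separately.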