OK - it appears everyone agrees we should go the "Unicode API" route. I actually thought my scheme did not preclude moving to this later.

This is a much bigger can of worms than I have bandwidth to take on at the moment. As Martin mentions, what will os.listdir() return on Win9x vs Win2k? What does passing a Unicode object to a non-Unicode Win32 platform mean? etc. How do Win95/98/ME differ in their Unicode support? Do the various service packs for each of these change the basic support?

So unfortunately this simply means the status quo remains until someone _does_ have the time and inclination. That may well be me in the future, but is not now. It also means that until then, Python programmers will struggle with this and determine that they can make it work simply by encoding the Unicode as an "mbcs" string. Or worse, they will note that "latin1 seems to work" and use that even though it will work "less often" than mbcs. I was simply hoping to automate that encoding using a scheme that works "most often".

The biggest drawback is that by doing nothing we are _encouraging_ the user to write broken code. The way things stand at the moment, the users will _never_ pass Unicode objects to these APIs (as they don't work) and will therefore manually encode a string. To my mind this is _worse_ than what my scheme proposes - at least my scheme allows Unicode objects to be passed to the Python functions - Python may choose to change the way it handles these in the future. But by forcing the user to encode a string we have lost _all_ meaningful information about the Unicode object and can only hope they got the encoding right.

If anyone else decides to take this on, please let me know. However, I fear that in a couple of years we may still be waiting and in the meantime people will be coding hacks that will _not_ work in the new scheme.

c'est-la-vie-ly,

Mark.
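
For readers following along, the point about "latin1 seems to work" but works "less often" than mbcs can be sketched roughly as below. This is only an illustration, not part of the proposed scheme. It assumes modern Python 3 strings, a made-up Cyrillic file name, and uses the cp1251 codec as a stand-in for what "mbcs" might resolve to on a Russian-locale Windows machine (the real "mbcs" codec exists only on Windows and follows the system's ANSI code page). The helper try_encode is hypothetical.

```python
def try_encode(name, encoding):
    """Return the encoded bytes, or None if the name cannot be represented."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError:
        return None


# Hypothetical file name containing Cyrillic letters.
filename = "\u0444\u0430\u0439\u043b.txt"

# Assumption: cp1251 stands in for the ANSI code page that "mbcs"
# would use on a Russian-locale Windows system.
as_ansi = try_encode(filename, "cp1251")

# latin-1 has no Cyrillic letters, so the same name cannot be encoded.
as_latin1 = try_encode(filename, "latin-1")

# A pure-ASCII name encodes identically either way, which is why
# "latin1 seems to work" during casual testing.
ascii_same = try_encode("readme.txt", "latin-1") == try_encode("readme.txt", "cp1251")

print(as_ansi, as_latin1, ascii_same)
```

Either way, once the caller has encoded by hand, the function receiving the bytes can no longer tell which encoding was chosen, which is the loss of information described above.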