> The problem with this, and other preceding schemes that have been
> discussed here, is that there is no means of ascertaining whether a
> particular file name str was obtained from a str API, or was
> funny-decoded from a bytes API... and thus, there is no means of
> reliably ascertaining whether a particular filename str should be
> passed to a str API, or funny-encoded back to bytes.

Why is it necessary that you are able to make this distinction?

> Picking a character (I don't find U+F01xx in the
> Unicode standard, so I don't know what it is)

It's a private use area. It will never carry an official character
assignment.

> As I realized in the email-sig, in talking about decoding corrupted
> headers, there is only one way to guarantee this... to encode _all_
> character sequences, from _all_ interfaces. Basically it requires
> reserving an escape character (I'll use ? in these examples -- yes, an
> ASCII question mark -- happens to be illegal in Windows filenames so
> all the better on that platform, but the specific character doesn't
> matter... avoiding / \ and . is probably good, though).

I think you'll have to write an alternative PEP if you want to see
something like this implemented throughout Python.

Regards,
Martin