Oleg Broytmann wrote:
> My filemanager
> (Midnight Commander, for the matter) shows these files and directories as
> "?????.???", but I can chdir to such directories, and I can open such
> files. It would be a big bad blow for me if filemanagers (or other
> programs) start to filter these filenames.

Summary for those without the time to read the longer version below:

- File managers, backup managers and similar apps should use the binary APIs worldwide
- Most apps in countries where encoding problems are common will also need to use the binary APIs to be acceptable to their users
- Many apps in countries where the 'native' encoding is UTF-8, ASCII or latin-1 will be able to use the Unicode APIs without any issues whatsoever
- Apps targeting a limited, well-controlled execution environment (e.g. web services) will also be able to use the Unicode APIs
- I think the binary and Unicode APIs should be available (and fully functional) on all platforms (including Windows) so that app developers don't create portability problems for themselves when they make the decision as to which API to use

-------------

The point about *filesystem* apps (i.e. file managers, backup tools, indexing engines) needing to deal with the imperfect world of dodgy filesystem encodings isn't in dispute at all - that's why the binary alternative APIs were added.

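As a rough illustration of what "parallel" APIs means in practice, here is a minimal sketch using `os.listdir` as it behaves in current Python 3, where the type of the argument selects the type of the results. The helper name `list_names` and the file name `notes.txt` are made up for this example and are not part of the discussion above.

```python
import os
import tempfile


def list_names(directory):
    """Return (text_names, byte_names) for the entries of `directory`.

    Hypothetical helper: it only shows that the same call has a
    Unicode flavour and a binary flavour.
    """
    # str argument -> str results (the "Unicode API")
    text_names = sorted(os.listdir(directory))
    # bytes argument -> bytes results (the "binary API")
    byte_names = sorted(os.listdir(os.fsencode(directory)))
    return text_names, byte_names


with tempfile.TemporaryDirectory() as tmp:
    # Create one file with a plain ASCII name so the example is portable.
    open(os.path.join(tmp, "notes.txt"), "w").close()
    text_names, byte_names = list_names(tmp)
    print(text_names, byte_names)
```

An app that only ever sees well-formed names can stay with the first form; a file manager or backup tool that must handle every name the filesystem can hold would use the second.
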
The point is that there is a spectrum from providing a completely clean solution that addresses only the ideal case of "file paths and other items such as environment variable names and values retrieved from the OS are always well-formed text in the appropriate default encoding" (which will actually work for large chunks of the planet - those where the locals are native ASCII speakers and those where computers didn't start to enter widespread use until after Unicode was already available) to addressing only the most pessimistic case of "you can't trust the default encoding at all, and need to assume that all strings retrieved from the OS contain arbitrary binary data" (which is actually true for some parts of the planet, but thankfully not for all of it).

Hopefully people can at least agree that the first extreme is unacceptable because that ideal world doesn't exist. I personally think that the other extreme is *also* unacceptable, because it burdens every single application developer with dealing with a potential problem that quite simply may not be a problem for them because they're in a situation where the naive assumption of a sane operating environment is actually a valid one for their particular application.

The idea of parallel Unicode and bytes APIs means that for those with an appropriately limited target environment and/or audience, the Unicode APIs will "just work", while the developers that aren't so lucky can rely on the binary APIs instead.

That's actually the one place where I disagree with Guido: I agree with Adam that the binary APIs *should* be available on Windows.
The difference would be that whereas on *nix type systems, the bytes APIs are the 'lower level' that more accurately represents the underlying OS, on Windows it would be the other way around, with the Unicode APIs as the lower level ones, and the binary APIs as wrappers around them that automatically decoded the bytes representation to a Unicode one when writing to the OS, and encoded from Unicode to bytes when reading from the OS.

If the binary APIs are missing from a major platform (i.e. Windows) then the choice to use them brings with it a major cross-platform portability problem that should really be handled by the standard library.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------