> Sounds like the run-time error solution would at least "solve"
> the issue in terms of making it depend on the used file name
> and underlying OS or file system.

Such a solution is impossible to implement in some cases. E.g. on
Windows, if you use the ANSI (*A) APIs to list the directory
contents, Windows will *silently* (AFAIK) give you incorrect file
names, i.e. it will replace unrepresentable characters with the
replacement char (QUESTION MARK).

OTOH, on Unix, there is a better approach for listdir and
unconvertible names: just return the byte strings to the user.

> I'd say: let the different file name based APIs try hard enough
> and then have them bail out if they can't handle the particular
> case.

That is a good idea. However, in case of the WinNT replacement
strategy, the application may still want to know that a name was
altered.

Passing *in* Unicode objects is no issue at all: If they cannot be
converted to a reasonable file name, you clearly get an exception.

> > It turns out that only OS X really got it right: For each file, there
> > is both a byte string name, and a Unicode name.
>
> I suppose this is due to the fact that Mac file systems store
> extended attributes (much like what OS/2 does too) along with the
> file -- that's a really nice way of being able to extend file
> system semantics on a per-file basis; much better than the Windows
> Registry or the MIME guess-by-extension mechanisms.

I'd assume it is different: They just *define* that all local file
systems they have control over use UTF-8 on disk, at least for BSD
ufs; for HFS, it might be that they 'just know' what encoding is
used on an HFS partition. I doubt they use extended attributes for
this, as they reportedly return UTF-8 even for file systems they've
never seen before; this may be either due to static knowledge (e.g.
that VFAT is UCS-2LE), or through guessing.

It may be that there are also limitations and restrictions, but at
least they remove the burden from the application.

Regards,
Martin
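P.S. A minimal sketch of the two behaviours discussed above, assuming present-day Python 3 rather than the interpreter this thread was written against. The helper names (`ansi_view`, `is_lossy`, `list_raw`) are hypothetical, and `cp1252` is only a stand-in for whatever ANSI code page a given Windows system uses. The first helper imitates the silent question-mark replacement, the second shows how a strict conversion raises instead, and the third asks listdir for byte strings.

```python
import os
import tempfile


def ansi_view(name, codepage="cp1252"):
    """Imitate a lossy ANSI conversion: characters the code page
    cannot represent are replaced with '?' (assumed code page)."""
    return name.encode(codepage, errors="replace").decode(codepage)


def is_lossy(name, codepage="cp1252"):
    """Return True if a strict conversion of the name would fail,
    i.e. the case where the application may want to be told."""
    try:
        name.encode(codepage)  # strict by default: raises on failure
    except UnicodeEncodeError:
        return True
    return False


def list_raw(path):
    """List a directory as byte strings by passing a bytes path,
    so that no decoding of the entries is attempted."""
    return os.listdir(os.fsencode(path))


if __name__ == "__main__":
    cyrillic = "\u0436.txt"  # a name cp1252 cannot represent
    print(ansi_view(cyrillic), is_lossy(cyrillic))

    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "a.txt"), "w").close()
        print(list_raw(d))
```

The point of `is_lossy` is that the information is only recoverable on the way *in*: once a directory listing has already been converted with replacement, the original name cannot be reconstructed from the '?' characters.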