Showing content from http://mail.python.org/pipermail/python-dev/2002-January/019556.html below:

[Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ...
Martin v. Loewis martin@v.loewis.de
Thu, 17 Jan 2002 12:42:21 +0100
> Sounds like the run-time error solution would at least "solve"
> the issue in terms of making it depend on the used file name
> and underlying OS or file system.

Such a solution is impossible to implement in some cases. E.g. on
Windows, if you use the ANSI (*A) APIs to list the directory contents,
Windows will *silently* (AFAIK) give you incorrect file names, i.e. it
will replace unrepresentable characters with the replacement char
(QUESTION MARK).
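
The silent replacement described above can be imitated in pure Python. This is only a sketch of the lossy conversion, not a call into any Windows API: it assumes cp1252 as the ANSI code page, and both file names are hypothetical.

```python
# Simulates the lossy ANSI conversion only; no Windows API is called here.
# Assumption: cp1252 stands in for the system's ANSI code page.
ANSI_CODE_PAGE = "cp1252"


def ansi_view(name):
    """Return the name as an ANSI-only caller would see it."""
    # errors="replace" substitutes '?' for every unrepresentable character.
    return name.encode(ANSI_CODE_PAGE, errors="replace").decode(ANSI_CODE_PAGE)


# Two hypothetical, distinct file names outside cp1252.
first = "\u65e5\u672c\u8a9e.txt"
second = "\u4e2d\u6587\u5b57.txt"

lossy_first = ansi_view(first)
lossy_second = ansi_view(second)

print(lossy_first)                   # ???.txt
print(lossy_first == lossy_second)   # True: the two names are now indistinguishable
```

No exception is raised at any point, which is why a run-time error cannot be produced from the result alone.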

OTOH, on Unix, there is a better approach for listdir and
unconvertible names: just return the byte strings to the user.
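
For illustration, current Python 3 exposes byte-string names when os.listdir is given a bytes path. This sketch shows that later behaviour, not what was available when this message was written; the directory and the file name are hypothetical.

```python
import os
import tempfile

# Assumption: Python 3, where a bytes path makes os.listdir return bytes names.
with tempfile.TemporaryDirectory() as directory:
    # Create one hypothetical file so the listing is not empty.
    with open(os.path.join(directory, "example.txt"), "w"):
        pass

    # A str path yields str names; a bytes path yields the names as bytes.
    names_as_text = os.listdir(directory)
    names_as_bytes = os.listdir(os.fsencode(directory))

print(names_as_text)
print(names_as_bytes)
```

With a bytes listing, the caller decides how (and whether) to decode each name.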

> I'd say: let the different file name based APIs try hard enough
> and then have them bail out if they can't handle the particular
> case.

That is a good idea. However, in the case of the WinNT replacement
strategy, the application may still want to know.

Passing *in* Unicode objects is no issue at all: If they cannot be
converted to a reasonable file name, you clearly get an exception.
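
A minimal sketch of that inbound case, assuming a file system encoding that cannot represent the name. The helper to_file_name_bytes and the choice of "ascii" are hypothetical; they stand in for whatever conversion a file name API would perform.

```python
def to_file_name_bytes(name, encoding="ascii"):
    """Hypothetical helper: convert a Unicode name to the on-disk encoding."""
    # str.encode is strict by default, so unrepresentable characters raise.
    return name.encode(encoding)


error = None
try:
    to_file_name_bytes("r\u00e9sum\u00e9.txt")
except UnicodeEncodeError as exc:
    # The failure is explicit, unlike the silent replacement on output.
    error = exc

print(type(error).__name__)
```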

> > It turns out that only OS X really got it right: For each file, there
> > is both a byte string name, and a Unicode name.
> 
> I suppose this is due to the fact that Mac file systems store
> extended attributes (much like what OS/2 does too) along with the
> file -- that's a really nice way of being able to extend file
> system semantics on a per-file basis; much better than the Windows
> Registry or the MIME guess-by-extension mechanisms.

I'd assume it is different: They just *define* that all local file
systems they have control over use UTF-8 on disk, at least for BSD ufs;
for HFS, it might be that they 'just know' what encoding is used on an
HFS partition. I doubt they use extended attributes for this, as they
reportedly return UTF-8 even for file systems they've never seen
before; this may be either due to static knowledge (e.g. that VFAT is
UCS-2LE), or through guessing.

It may be that there are also limitations and restrictions, but
at least they remove the burden from the application.

Regards,
Martin


