2008/9/30 Glenn Linderman <v+python at g.nevcal.com>:

> So the problem is that a Unicode file system interface can't deal with
> non-UTF-8 byte streams as file names.
>
> So it seems there are four suggested approaches, all of which have aspects
> that are inconvenient.

Let's not forget what happens when a non-UTF-8 file name is read from
a file or written to a file, under the assumption that the filename is
written to the file directly (which probably breaks for filenames
containing newlines or such).

> 4) Use of bytes APIs on FS interfaces. This seems to be the "solution"
> adopted by Posix that creates the "problem" encountered by Unicode-native
> applications. It is cumbersome to deal with within applications that
> attempt to display the names. What do Posix-style "open file" dialog boxes
> do in this case?

http://library.gnome.org/devel/glib/stable/glib-Character-Set-Conversion.html#g-filename-display-name

I used to observe three different ways to display such filenames
within gedit (including %xx and \xx escapes), but now it is consistent,
probably because it switched to using the above function everywhere:

$ touch $'abc\xffz'
$ gedit

The Open dialog shows:
abc�z (invalid encoding)

When the file is open, the window title and the tab title show:
abc�z
and the same is in recent file list.

It has a bug: it appends " (invalid encoding)" even if the filename
contains a correctly encoded U+FFFD character. Nautilus has the same
behavior and the same bug because this is a design bug of that function
which does not allow to tell whether the conversion was successful.

A filename containing a newline is sometimes displayed in two lines,
and sometimes with a U+000A character from a fallback font (hex
character number in a box).

-- 
Marcin Kowalczyk
qrczak at knm.org.pl
http://qrnik.knm.org.pl/~qrczak/
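
The design bug described above (a display-name conversion that cannot report whether it succeeded) can be illustrated with a small Python sketch. This is not GLib's implementation; `display_name` is a hypothetical helper, and UTF-8 is assumed as the filename encoding. It returns the displayable text together with a flag, so a caller can distinguish a name that was damaged during decoding from one that legitimately contains U+FFFD.

```python
from typing import Tuple


def display_name(raw: bytes, encoding: str = "utf-8") -> Tuple[str, bool]:
    """Hypothetical helper: return (text_for_display, decoded_cleanly).

    The boolean is the piece of information a bare display-name
    function cannot convey to its caller.
    """
    try:
        # Strict decoding succeeds only for correctly encoded names.
        return raw.decode(encoding), True
    except UnicodeDecodeError:
        # Fall back to U+FFFD for each undecodable byte sequence.
        return raw.decode(encoding, errors="replace"), False


# A name with an invalid byte, like the one created by: touch $'abc\xffz'
broken_text, broken_ok = display_name(b"abc\xffz")

# A name that really contains a correctly encoded U+FFFD character.
valid_text, valid_ok = display_name("abc\ufffdz".encode("utf-8"))

# Both produce the same display text, but only the flag tells them apart.
for text, ok in ((broken_text, broken_ok), (valid_text, valid_ok)):
    suffix = "" if ok else " (invalid encoding)"
    print(text + suffix)
```

With the flag available, the " (invalid encoding)" suffix is appended only to the first name, which is the behavior the message says gedit and Nautilus cannot achieve when they have to guess from the presence of U+FFFD in the result.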