Guido van Rossum wrote:
>>Guido van Rossum wrote:
>>
>>>But if you pass the normalized string (or the Latin-1 string) to
>>>open(), will it find the file?
>>
>>I tried opening a file using both "o\xcc\x88" and "\xc3\xb6". Both
>>result in the same file being opened.
>>
>>
>>>I.e. if the filesystem has the
>>>unnormalized name stored in its directory, will filesystem requests
>>>normalize filenames before comparing them?
>>
>>It could be that Apple is decomposing the filenames before comparing
>>them. Either way works.

The recommended way of doing normalization is to go by Normalization
Form C: Canonical Decomposition, followed by Canonical Composition.

See http://www.unicode.org/unicode/reports/tr15/#Specification

Note that for proper collation support, Unicode strings must first be
normalized.

See http://www.unicode.org/unicode/reports/tr10/#Main_Algorithm

> Hm, that sucks (either way) -- because you get unnormalized Unicode
> out of directory listings, which is harder to turn into local
> encodings.

You can easily normalize it again (provided you have a normalization
lib at hand).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/
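
For illustration only, here is a minimal sketch of the "normalize it again"
step, using the two byte sequences quoted above. It assumes a current Python
with the standard unicodedata module as the normalization lib; the helper
names nfc() and normalized_listing() are made up for this example and do not
come from the thread.

```python
import os
import unicodedata

# The two UTF-8 byte sequences from the thread:
# "o" followed by COMBINING DIAERESIS (decomposed) vs. precomposed o-umlaut.
decomposed = b"o\xcc\x88".decode("utf-8")
precomposed = b"\xc3\xb6".decode("utf-8")


def nfc(name):
    """Return name in Normalization Form C (decompose, then compose)."""
    return unicodedata.normalize("NFC", name)


def normalized_listing(path):
    """Hypothetical helper: list a directory with every name put into NFC."""
    return [nfc(name) for name in os.listdir(path)]


# The raw strings differ code point by code point ...
print(decomposed == precomposed)        # False
# ... but compare equal once both are in the same normalization form.
print(nfc(decomposed) == precomposed)   # True
```

Whether the filesystem itself hands back decomposed or precomposed names
depends on the platform, so normalizing right after the directory listing
is one way to get a predictable form before converting to a local encoding.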