On 6-dec-03, at 18:48, Skip Montanaro wrote:
> Two of the test_unicode_file began failing on my Mac today (fresh cvs up,
> OS X 10.2.8, vanilla unix-style build):
>
> ======================================================================
> FAIL: test_directories (__main__.TestUnicodeFiles)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "../Lib/test/test_unicode_file.py", line 155, in test_directories
>     self._do_directory(TESTFN_ENCODED+ext, TESTFN_ENCODED+ext, os.getcwd)
>   File "../Lib/test/test_unicode_file.py", line 103, in _do_directory
>     make_name)
> AssertionError: '@test-a\xcc\x80o\xcc\x80.dir' != '@test-\xc3\xa0\xc3\xb2.dir'

This is probably related to the two flavors of Unicode: one prefers to
keep all accents separate from the letters as much as possible, and the
other prefers the reverse. I keep forgetting the names of the two, they're
somewhat silly. But the problem is that Python prefers to represent the
string "ä" as the two characters "a" and "umlaut on the previous char",
and MacOSX prefers to represent the same string as "a with umlaut on it".
Or the other way around, this is something else I always forget.

And while there are algorithms to convert the combined form of Unicode to
the uncombined form and vice versa, there are no Python codecs to do this.
The OSX system calls do the right thing (convert both forms to what it
prefers), but when you do a readdir() you don't get back the string you
put in.
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman
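
For reference, the two flavors are what the Unicode standard calls normalization forms: NFD (decomposed, accents as separate combining characters) and NFC (composed, one precomposed character where possible). The sketch below is only an illustration, not a codec and not a fix for the test. It assumes Python 3 and the standard library's unicodedata.normalize(), and same_name() is a hypothetical helper introduced here. It takes the two byte strings from the AssertionError above and shows that they differ as written but compare equal once both are brought to the same form.

```python
import unicodedata

# The two names from the AssertionError, decoded from their UTF-8 bytes.
# Decomposed: base letter followed by U+0300 COMBINING GRAVE ACCENT.
decomposed = b"@test-a\xcc\x80o\xcc\x80.dir".decode("utf-8")
# Composed: single precomposed characters U+00E0 and U+00F2.
composed = b"@test-\xc3\xa0\xc3\xb2.dir".decode("utf-8")


def same_name(a, b, form="NFC"):
    """Hypothetical helper: compare two names after normalizing both
    to the same Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A plain comparison sees different code point sequences.
print(composed == decomposed)

# After normalization the two names are the same string.
print(same_name(composed, decomposed))

# Converting explicitly in either direction.
print(unicodedata.normalize("NFD", composed) == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Whether NFC or NFD is the better form to compare in is a judgment call; what matters for a check like the failing one is that both sides are normalized the same way before being compared.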