I was going to suggest that if we return mixed sets of unicode/string values from listdir() we could also do the same thing for platforms where FileSystemDefaultEncoding is utf-8, such as MacOSX.

But as usual with unicode, when I actually try this it doesn't work, and I don't understand why not. Why is unicode always something that seems so simple and logical until you actually try it??!?!?

Here's a transcript of my Python session. The terminal has been set to render in latin-1. The directory contains one file, "frör" (fr-o-umlaut-r).

sap!jack- python
Python 2.3a0 (#32, Aug 12 2002, 15:31:25)
[GCC 2.95.2 19991024 (release)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir('.')
['fro\xcc\x88r']
>>> utf8name = os.listdir('.')[0]
>>> unicodename = utf8name.decode('utf-8')
>>> unicodename
u'fro\u0308r'
>>> print unicodename.encode('latin-1')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: Latin-1 encoding error: ordinal not in range(256)
>>>

Sigh. \u0308 is not in the range(256), but the whole point of encode('latin-1') is to make it so, isn't it? And o-umlaut definitely has a latin-1 encoding. I tried the same with macroman instead of latin-1 (just to make sure this wasn't a bug in the latin-1 encoder), but still no go.

What am I doing wrong?
--
- Jack Jansen <Jack.Jansen@oratrix.com> http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -
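
A possible explanation for the traceback above, offered as a sketch and not as a confirmed diagnosis: the decoded name holds a plain 'o' followed by U+0308 (COMBINING DIAERESIS), i.e. the decomposed form of o-umlaut, and the latin-1 codec maps code points one at a time without composing them. Normalizing to the composed form (NFC) first should yield U+00F6, which latin-1 can encode. The example below uses current Python 3 syntax, not the Python 2.3 of the session, and hard-codes the bytes that listdir() returned there; the variable names are only illustrative.

```python
import unicodedata

# Bytes copied from the os.listdir('.') result in the session above.
utf8name = b'fro\xcc\x88r'

# Decoding gives 'o' followed by U+0308 (COMBINING DIAERESIS): five code points.
decomposed = utf8name.decode('utf-8')

# NFC composes 'o' + U+0308 into the single code point U+00F6.
composed = unicodedata.normalize('NFC', decomposed)

# U+00F6 is below 256, so the latin-1 codec can now encode the name.
latin1_bytes = composed.encode('latin-1')

print(len(decomposed), len(composed), repr(latin1_bytes))
```

If this reading is right, the same normalization step would be needed before encoding to macroman as well, since that codec also has no mapping for a bare combining diaeresis.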