Jack Jansen wrote:

> [...]
> Here's a transcript of my Python session. The terminal has been set to
> render in latin-1. The directory contains one file, "frör" (fr-o-umlaut-r).
> sap!jack- python
> Python 2.3a0 (#32, Aug 12 2002, 15:31:25)
> [GCC 2.95.2 19991024 (release)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import os
> >>> os.listdir('.')
> ['fro\xcc\x88r']
> >>> utf8name = os.listdir('.')[0]
> >>> unicodename = utf8name.decode('utf-8')
> >>> unicodename
> u'fro\u0308r'

U+0308 is not 'LATIN SMALL LETTER O WITH DIAERESIS' but 'COMBINING
DIAERESIS', i.e. the ö got decomposed into o + 'COMBINING DIAERESIS'.

> [...]

Bye,
   Walter Dörwald
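
The decomposition described above can be checked, and undone, with the standard unicodedata module. This is only a sketch: it uses Python 3 string syntax rather than the 2.3 session quoted above, and the variable names (decomposed, composed, names) are made up for illustration. It assumes the filename has already been decoded from UTF-8.

```python
import unicodedata

# The decoded filename from the transcript: 'o' followed by U+0308.
decomposed = "fro\u0308r"

# Look up the Unicode name of each code point to see what is really there.
names = [unicodedata.name(ch) for ch in decomposed]

# NFC normalization recombines 'o' + COMBINING DIAERESIS into the single
# precomposed code point U+00F6.
composed = unicodedata.normalize("NFC", decomposed)

# NFD goes the other way and reproduces the decomposed form.
roundtrip = unicodedata.normalize("NFD", composed)

print(names)
print(repr(composed), len(decomposed), len(composed))
```

Comparing filenames after normalizing both sides to the same form (NFC or NFD) is one way to make "frör" typed by a user match the name the filesystem reports; whether that is appropriate for a given application is a separate question.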