There is currently a discussion on i18n about how to write Python
source code in different encodings.

The (experimental) solution so far has been to add a command line
switch to Python which tells the compiler which encoding to expect
for u"...strings..." ("...8-bit strings..." will still be used as
is -- it's the user's responsibility to use the right encoding;
the Unicode implementation will still assume them to be UTF-8
encoded in automatic conversions).

In the end, a #pragma should be usable to tell the compiler which
encoding to use for decoding the u"..." strings.

What we need now is a good proposal for handling these #pragmas...
does anyone have experience with these? Any ideas?

Here's a simple strawman for the syntax:

    # pragma key: value

    parser = re.compile(
        r'^#\s*pragma\s+'
        r'([a-zA-Z_][a-zA-Z0-9_]*):\s*'
        r'(.+)'
        )

For the encoding this would be something like:

    # pragma encoding: unicode-escape

The compiler would scan these pragma definitions, add them to an
internal temporary dictionary and use them for all subsequent code
it finds during the compilation process. The dictionary would have
to stay around until the original compile() call has completed
(spanning recursive calls).

--
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
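
As a rough illustration of the scanning step proposed in the post, here is a minimal sketch in Python. The function name scan_pragmas and the sample input are hypothetical, not part of the proposal, and the sketch simplifies things: it collects every pragma in a source string into one dictionary (a later definition of the same key overwrites an earlier one) rather than tracking which pragma is in effect at each point in the code.

```python
import re

# Pattern from the strawman: "# pragma key: value" at the start of a line.
PRAGMA_RE = re.compile(r'^#\s*pragma\s+([a-zA-Z_][a-zA-Z0-9_]*):\s*(.+)')


def scan_pragmas(source):
    """Collect pragma definitions from source text into a dictionary.

    Hypothetical helper for illustration only; a real compiler would
    apply each pragma to the code that follows it.
    """
    pragmas = {}
    for line in source.splitlines():
        match = PRAGMA_RE.match(line)
        if match:
            key, value = match.groups()
            # A later pragma with the same key replaces the earlier one.
            pragmas[key] = value.strip()
    return pragmas


example = (
    "# pragma encoding: unicode-escape\n"
    "# an ordinary comment\n"
    "x = 1\n"
)
print(scan_pragmas(example))
```

Because the pattern is anchored at the start of the line, a pragma-like comment trailing a statement on the same line is ignored in this sketch; whether that is the desired behaviour is one of the open questions for the proposal.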