
Showing content from https://mail.python.org/pipermail/python-dev/2003-February/033232.html below:

[Python-Dev] Unicode source code

Just van Rossum just@letterror.com
Sun, 9 Feb 2003 18:22:45 +0100
M.-A. Lemburg wrote:

> Now, to accept Unicode it would probably be worthwhile hooking
> into this chain at step 2 rather than step 1 (the code for the
> tokenizer is in Parser/tokenizer.c, the compiler code in
> Python/compiler.c), however, this is difficult because most
> APIs for compiling code are built on char* buffers.
>
> A short-term solution would probably be to convert Unicode to
> UTF-8 and prepend a UTF-8 BOM mark so that the tokenizer
> knows that it is getting UTF-8. Haven't tested this though.

Hm. What I'm looking into now is to simply define a PyCompilerFlags flag
called PyCF_SOURCE_IS_UTF8. eval() and compile() will then convert a
unicode string to utf-8 and set this flag. This seems a very low-impact
solution. Does this make sense?
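
To make the two ideas concrete, here is a rough Python sketch, not the actual patch. It is written for current Python 3, where str plays the role of the unicode type. prepare_source() and compile_with_bom() are hypothetical helper names, and the flag value is only illustrative; it is never passed to the real compile().

```python
import codecs

# Illustrative stand-in for the proposed flag. The numeric value is an
# assumption for this sketch and is never handed to the real compile().
PyCF_SOURCE_IS_UTF8 = 0x0100


def prepare_source(source, flags=0):
    """Model of the proposed step: encode unicode input to UTF-8 and set the flag.

    Byte strings pass through unchanged, with the flags left as given.
    """
    if isinstance(source, str):
        return source.encode("utf-8"), flags | PyCF_SOURCE_IS_UTF8
    return source, flags


def compile_with_bom(source, filename="<string>", mode="eval"):
    """The quoted short-term idea: UTF-8 bytes with a BOM prepended."""
    data = codecs.BOM_UTF8 + source.encode("utf-8")
    return compile(data, filename, mode)


if __name__ == "__main__":
    data, flags = prepare_source("'h\u00e9'")
    print(data, hex(flags))
    print(eval(compile_with_bom("'h\u00e9'")))
```

The flag variant avoids copying the buffer just to prepend three bytes, which is part of why it looks lower-impact than the BOM route.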

Just


