Martin v. Loewis wrote:
> "Fredrik Lundh" <fredrik@pythonware.com> writes:
>
>>hmm. I'm tempted to think that there's a major
>>flaw in the PEP, caused by the fact that
>>
>>    compile(unicode(script, extract_encoding(script)))
>>
>>will, from what I can tell, not compile to the same
>>thing as:
>>
>>    compile(script)
>
>
> Can you elaborate what you think the difference is? I believe the PEP
> is silent on this specific aspect,

It does mention this as part of phase 2.

> but I think what should happen is
> (in the Unicode case):
>
> - compile will convert the script to UTF-8, which is then tokenized.
> - in the process of parsing, the encoding declaration (that presumably
>   extract_encoding was looking at as well) is recognized, if any.
> - Unicode literals are left as-is; byte string literals are converted
>   back to the original encoding.

Right.

> So if there is an encoding declaration in script, then I cannot see a
> difference. If there is none, the PEP does not elaborate what should
> happen. Leaving the byte strings as UTF-8 seems safest, since the only
> way to get "correct" non-ASCII strings without the encoding comment is
> to use the UTF-8 signature.
>
> In any case, this can't cause backwards compatibility
> problems. compile accepts Unicode strings today only if they can be
> converted to a byte string. In the standard installation, this will
> fail today if there is non-ASCII in script. So allowing Unicode in
> compile is a pure extension. If its precise meaning is underspecified,
> it should be clarified before stage 2 is implemented.

No need for this. The PEP already mentions it.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/
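
The `extract_encoding` call in Fredrik's example is not defined anywhere in the thread. As a minimal sketch, assuming it simply applies the PEP 263 rules (a UTF-8 signature, or a `coding[:=]` comment on the first or second line), it might look like the following in present-day Python. The function body, the bytes input, and the `None` return for an undeclared encoding are assumptions made for illustration, not the PEP's reference implementation.

```python
import codecs
import re

# Declaration pattern in the spirit of PEP 263: a comment containing
# "coding:" or "coding=" followed by the encoding name.
_CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def extract_encoding(script):
    """Hypothetical helper: return the declared source encoding, or None.

    `script` is the raw source as bytes.
    """
    # A UTF-8 signature (BOM) at the start of the file implies UTF-8.
    if script.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # Only the first two lines may carry the declaration.
    for line in script.splitlines()[:2]:
        match = _CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    # No declaration found; the caller decides on a default.
    return None
```

For real code, the standard library's `tokenize.detect_encoding` performs this detection and should be preferred over a hand-written helper.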