François Pinard wrote:
>>1. At run-time, identifiers are represented as Unicode objects unless
>>they are pure ASCII. IOW, they are converted from the source encoding
>>to Unicode objects in the process of parsing.
>
>
> This is already the case, isn't it?

Currently, all identifiers are byte strings, at run-time, representing
ASCII characters. IOW, you currently won't observe Unicode strings as
identifiers.

>>2. As a consequence of 1), all places there identifiers appear need to
>>support Unicode objects (e.g. __dict__, __getattr__, etc)
>
>
> I do not much know the internals, yet I suspect one more thing to
> consider is whether Unicode strings looking like non-ASCII identifiers
> should be interned or not, the same as currently done for ASCII.

Indeed; I had not thought about this.

> # -*- coding: Latin-1 -*-
> élève = 3
> print élève
[...]
> So, the Python compiler is sensitive to the active locale.

Yes, that's a bug. It will use byte strings as identifiers (without
running your example, I'd expect that dir() shows they are UTF-8)

> This is kind of an happy bug! May we count on it being supported in the
> interim? :-) :-)

I would think so: this bug has been present for quite some time, and
nobody complained :-)

Martin