> I'd like to work on adding support for non-ASCII characters
> in identifiers, using the following principles:
>
> 1. At run-time, identifiers are represented as Unicode
>    objects unless they are pure ASCII. IOW, they are
>    converted from the source encoding to Unicode objects
>    in the process of parsing.
>
> 2. As a consequence of 1), all places where identifiers
>    appear need to support Unicode objects (e.g. __dict__,
>    __getattr__, etc)
>
> 3. Legal non-ASCII identifiers are what legal non-ASCII
>    identifiers are in Java, except that Python may use
>    a different version of the Unicode character database.
>    Python would share the property that future versions
>    allow more characters in identifiers than older versions.
>
> If you are too lazy to look up the Java definition,
> here is a rough overview:
> An identifier is "JavaLetter JavaLetterOrDigit*"
>
> JavaLetter is a character of the classes Lu, Ll,
> Lt, Lm, or Lo, or a currency symbol (for Python:
> excluding $), or a connecting punctuation character
> (which is unfortunately underspecified - will
> research the implementation).
>
> JavaLetterOrDigit is a JavaLetter, or a digit,
> a numeric letter, a combining mark, a non-spacing
> mark, or an ignorable control character.
>
> Does this need a PEP?

Sure does. Since this could create a serious burden for code
portability, I'd like to see a serious section on motivation and
discussion on how to keep Unicode out of the standard library and out
of most 3rd party distributions. Without that I'm strongly -1.

--Guido van Rossum (home page: http://www.python.org/~guido/)
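
For readers who want to experiment with the rule quoted above, here is a minimal sketch of the "JavaLetter JavaLetterOrDigit*" overview using the standard `unicodedata` module. It is not the proposal's implementation. The function names are hypothetical, and the mapping from the quoted wording to general categories is an assumption: `Sc` for currency symbols, `Pc` for connecting punctuation, `Nd` for digits, `Nl` for numeric letters, `Mc` and `Mn` for combining and non-spacing marks, and `Cf` as a stand-in for "ignorable control character". Results depend on the Unicode database version shipped with the interpreter.

```python
import unicodedata

# Assumed category mapping for the quoted overview (not taken from the Java spec).
LETTER_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Sc", "Pc"}
EXTRA_PART_CATEGORIES = {"Nd", "Nl", "Mc", "Mn", "Cf"}


def is_java_letter(ch):
    """Return True if ch may start an identifier under the quoted overview."""
    if ch == "$":
        # The proposal excludes the dollar sign for Python.
        return False
    return unicodedata.category(ch) in LETTER_CATEGORIES


def is_java_letter_or_digit(ch):
    """Return True if ch may appear after the first character."""
    return is_java_letter(ch) or unicodedata.category(ch) in EXTRA_PART_CATEGORIES


def is_identifier(name):
    """Check name against "JavaLetter JavaLetterOrDigit*"."""
    if not name:
        return False
    return is_java_letter(name[0]) and all(
        is_java_letter_or_digit(ch) for ch in name[1:]
    )
```

Under these assumptions, `is_identifier("_x1")` and `is_identifier("größe")` are accepted, while `is_identifier("1x")` and `is_identifier("$x")` are rejected. The sketch ignores reserved words, which are a separate concern.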