Guido van Rossum wrote:
>
> > As you may have noticed, the Unicode objects provide
> > new methods .islower(), .isupper() and .istitle(). Finn Bock
> > mentioned that Java also provides .isdigit() and .isspace().
> >
> > Question: should Unicode also provide these character
> > property methods: .isdigit(), .isnumeric(), .isdecimal()
> > and .isspace() ? Plus maybe .digit(), .numeric() and
> > .decimal() for the corresponding decoding ?
>
> What would be the difference between isdigit, isnumeric, isdecimal?
> I'd say don't do more than Java. I don't understand what the
> "corresponding decoding" refers to. What would "3".decimal() return?

These originate in the Unicode database; see

ftp://ftp.unicode.org/Public/3.0-Update/UnicodeData-3.0.0.html

Here are the descriptions:

"""
6  Decimal digit value  normative
   This is a numeric field. If the character has the decimal digit
   property, as specified in Chapter 4 of the Unicode Standard, the
   value of that digit is represented with an integer value in this
   field.

7  Digit value  normative
   This is a numeric field. If the character represents a digit, not
   necessarily a decimal digit, the value is here. This covers digits
   which do not form decimal radix forms, such as the compatibility
   superscript digits.

8  Numeric value  normative
   This is a numeric field. If the character has the numeric
   property, as specified in Chapter 4 of the Unicode Standard, the
   value of that character is represented with an integer or rational
   number in this field. This includes fractions as, e.g., "1/5" for
   U+2155 VULGAR FRACTION ONE FIFTH. Also included are numerical
   values for compatibility characters such as circled numbers.
"""

u"3".decimal() would return 3. u"\u2155".
Some more examples from the unicodedata module (which makes
all fields of the database available in Python):

>>> unicodedata.decimal(u"3")
3
>>> unicodedata.decimal(u"²")
2
>>> unicodedata.digit(u"²")
2
>>> unicodedata.numeric(u"²")
2.0
>>> unicodedata.numeric(u"\u2155")
0.2
>>> unicodedata.numeric(u'\u215b')
0.125

> > Similar APIs are already available through the unicodedata
> > module, but could easily be moved to the Unicode object
> > (they cause the builtin interpreter to grow a bit in size
> > due to the new mapping tables).
> >
> > BTW, string.atoi et al. are currently not mapped to
> > string methods... should they be ?
>
> They are mapped to int() c.s.

Hmm, I just noticed that int() et friends don't like
Unicode... shouldn't they use the "t" parser marker
instead of requiring a string or tp_int compatible
type ?

--
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
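
For anyone trying the three fields in a current Python 3 interpreter, here is a minimal sketch of how decimal, digit and numeric values differ per character. The helper name `describe` is hypothetical and only for illustration; the `unicodedata` functions and the `str` methods `.isdecimal()`, `.isdigit()` and `.isnumeric()` are the ones Python 3 provides. Results depend on the Unicode database version bundled with the interpreter, so they need not match the session quoted above.

```python
import unicodedata


def describe(ch):
    """Return (decimal, digit, numeric) for one character.

    Hypothetical helper: passing None as the default makes each lookup
    return None instead of raising ValueError when the field is absent.
    """
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


# DIGIT THREE, SUPERSCRIPT TWO, VULGAR FRACTION ONE FIFTH, and a plain letter
for ch in ("3", "\u00b2", "\u2155", "a"):
    print(
        "U+%04X" % ord(ch),
        describe(ch),
        ch.isdecimal(),
        ch.isdigit(),
        ch.isnumeric(),
    )

# In Python 3 every str is Unicode, and int() accepts any character that
# has the decimal digit property, e.g. ARABIC-INDIC DIGIT THREE.
print(int("\u0663"))
```

The three predicates nest: a character with a decimal value also has a digit and a numeric value, but not the other way round, which is why the fraction only reports a numeric value.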