Showing content from http://www.python.org/pipermail/python-dev/2000-July/005366.html below:

[Python-Dev] unicode alphanumerics

Finn Bock bckfnn@worldonline.dk
Mon, 03 Jul 2000 17:06:05 GMT
[M.-A. Lemburg]

>"M.-A. Lemburg" wrote:
>> 
>> Fredrik Lundh wrote:
>> > how about this plan:
>> >
>> > -- you add a Py_UNICODE_ALPHA to unicodeobject.h asap,
>> >    which does exactly that (or I can do that, if you prefer).
>> >    (and maybe even a Py_UNICODE_ALNUM)
>> 
>> Ok, I'll add Py_UNICODE_ISALPHA and Py_UNICODE_ISALNUM
>> (first with approximations of the sort you give above and
>> later with true implementations using tables in unicodectype.c)
>> on Monday... gotta run now.
>> 
>> > -- I change SRE to use that asap.
>> >
>> > -- you, I, or someone else add a better implementation,
>> >    some other day.
>
>I've just looked into this... the problem here is what to
>consider as being "alpha" and what "numeric". 
>
>I could add two new tables for the characters with category 'Lo'
>(other letters, not cased) and 'Lm' (letter modifiers)
>to match all letters in the Unicode database, but those
>tables have some 5200 entries (note that there are only 804 lower
>case letters and 686 upper case ones).

In JDK1.3, Character.isLetter(..) and Character.isDigit(..) are 
documented as:

  http://java.sun.com/j2se/1.3/docs/api/java/lang/Character.html#isLetter(char)
  http://java.sun.com/j2se/1.3/docs/api/java/lang/Character.html#isDigit(char)
  http://java.sun.com/j2se/1.3/docs/api/java/lang/Character.html#isLetterOrDigit(char)

I guess that Java uses the extra huge tables.
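
To get a feel for the table sizes under discussion, here is a rough sketch using Python's standard unicodedata module. The helper names (count_letter_categories, is_alpha, is_alnum) are made up for illustration and are not the proposed Py_UNICODE_ISALPHA / Py_UNICODE_ISALNUM macros. Treating "digit" as category 'Nd' only is an assumption, loosely modelled on Java's isLetterOrDigit. The counts depend on the Unicode database version bundled with the interpreter, so they will not match the figures quoted above.

```python
import unicodedata
from collections import Counter

# The five Unicode general categories that make up "letters".
LETTER_CATEGORIES = ("Lu", "Ll", "Lt", "Lm", "Lo")


def count_letter_categories(limit=0x10000):
    """Count code points below `limit` in each letter category."""
    counts = Counter()
    for cp in range(limit):
        cat = unicodedata.category(chr(cp))
        if cat in LETTER_CATEGORIES:
            counts[cat] += 1
    return counts


def is_alpha(ch):
    """True if `ch` is in any letter category, cased or not."""
    return unicodedata.category(ch) in LETTER_CATEGORIES


def is_alnum(ch):
    """True if `ch` is a letter or a decimal digit (category 'Nd')."""
    return is_alpha(ch) or unicodedata.category(ch) == "Nd"


if __name__ == "__main__":
    for cat, n in sorted(count_letter_categories().items()):
        print(cat, n)
```

Running it prints one line per letter category, which shows how much larger the uncased 'Lo' set is than the cased ones.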

regards,
finn


