Showing content from https://mail.python.org/pipermail/python-dev/2000-July/005290.html below:

[Python-Dev] unicode alphanumerics

M.-A. Lemburg mal@lemburg.com
Sat, 01 Jul 2000 18:56:12 +0200
Fredrik Lundh wrote:
> 
> when looking through skip's coverage listing, I noted a bug in
> SRE:
> 
> #define SRE_UNI_IS_ALNUM(ch) ((ch) < 256 ? isalnum((ch)) : 0)
> 
> this predicate is used for \w when a pattern is compiled using
> the "unicode locale" (flag U), and should definitely not use 8-bit
> locale stuff.
> 
> however, there's no such thing as a Py_UNICODE_ISALNUM
> (or even a Py_UNICODE_ISALPHA).  what should I do?  how
> about using:
> 
>     Py_UNICODE_ISLOWER ||
>     Py_UNICODE_ISUPPER ||
>     Py_UNICODE_ISTITLE ||
>     Py_UNICODE_ISDIGIT

This will give you all cased chars along with all digits;
it omits the non-cased ones.

It's a good start, but probably won't cover the full range
of letters + numbers.

Perhaps we need another table for isalpha in unicodectype.c ?
(Or at least one which defines all non-cased letters.)
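
To illustrate the gap, here is a rough, self-contained sketch. It does
not use the real Py_UNICODE_* macros: the is_lower/is_upper/is_title/
is_digit helpers are ASCII-only stand-ins, ucs_t is a hypothetical
replacement for Py_UNICODE, and the uncased_letters table is a tiny
hand-picked sample (Hebrew letters and CJK ideographs), not the table
that would actually go into unicodectype.c.

```c
#include <stddef.h>

typedef unsigned int ucs_t;  /* hypothetical stand-in for Py_UNICODE */

/* ASCII-only stand-ins for Py_UNICODE_ISLOWER and friends (assumption:
   the real macros consult the Unicode database instead). */
static int is_lower(ucs_t ch) { return ch >= 'a' && ch <= 'z'; }
static int is_upper(ucs_t ch) { return ch >= 'A' && ch <= 'Z'; }
static int is_title(ucs_t ch) { (void)ch; return 0; }  /* none in ASCII */
static int is_digit(ucs_t ch) { return ch >= '0' && ch <= '9'; }

/* The composite proposed in the quoted message: cased chars + digits. */
static int is_cased_or_digit(ucs_t ch)
{
    return is_lower(ch) || is_upper(ch) || is_title(ch) || is_digit(ch);
}

/* Illustrative sample of letters that have no case at all. */
struct range { ucs_t first, last; };

static const struct range uncased_letters[] = {
    { 0x05D0, 0x05EA },  /* Hebrew letters alef..tav */
    { 0x4E00, 0x9FA5 },  /* CJK unified ideographs (Unicode 3.0 range) */
};

static int is_uncased_letter(ucs_t ch)
{
    size_t i;
    size_t n = sizeof(uncased_letters) / sizeof(uncased_letters[0]);

    for (i = 0; i < n; i++) {
        if (ch >= uncased_letters[i].first && ch <= uncased_letters[i].last)
            return 1;
    }
    return 0;
}

/* Composite plus the extra table: what an isalnum would need to cover. */
static int is_alnum(ucs_t ch)
{
    return is_cased_or_digit(ch) || is_uncased_letter(ch);
}
```

With these definitions, is_cased_or_digit(0x05D0) is false for Hebrew
alef, while is_alnum(0x05D0) is true, which is the difference the
extra table would make.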

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


