Fredrik Lundh wrote:
>
> when looking through skip's coverage listing, I noted a bug in
> SRE:
>
>     #define SRE_UNI_IS_ALNUM(ch) ((ch) < 256 ? isalnum((ch)) : 0)
>
> this predicate is used for \w when a pattern is compiled using
> the "unicode locale" (flag U), and should definitely not use 8-bit
> locale stuff.
>
> however, there's no such thing as a Py_UNICODE_ISALNUM
> (or even a Py_UNICODE_ISALPHA). what should I do? how
> about using:
>
>     Py_UNICODE_ISLOWER ||
>     Py_UNICODE_ISUPPER ||
>     Py_UNICODE_ISTITLE ||
>     Py_UNICODE_ISDIGIT

This will give you all cased chars along with all digits;
it omits the non-cased ones.

It's a good start, but probably won't cover the full range
of letters + numbers.

Perhaps we need another table for isalpha in unicodectype.c?
(Or at least one which defines all non-cased letters.)

--
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
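
To make the gap concrete, here is a rough sketch of the proposed predicate. It is written in Python rather than C, on the assumption that the `str` methods `islower()`, `isupper()`, `istitle()` and `isdigit()` behave like the corresponding `Py_UNICODE_IS*` macros for a single character. The helper name `cased_or_digit` is made up for illustration and is not part of SRE, and `str.isalpha()` is used only as a reference for what a letter table would report.

```python
# Sketch only: Python str methods stand in for the Py_UNICODE_IS* macros.
# cased_or_digit is a hypothetical name, not something defined in SRE.

def cased_or_digit(ch: str) -> bool:
    """Proposed predicate: lower || upper || title || digit."""
    return ch.islower() or ch.isupper() or ch.istitle() or ch.isdigit()

# A cased letter, a digit, and two letters that have no case at all
# (HEBREW LETTER ALEF and a CJK ideograph).
samples = ["a", "Z", "7", "\u05d0", "\u4e2d"]

for ch in samples:
    print(f"U+{ord(ch):04X}  cased_or_digit={cased_or_digit(ch)}  "
          f"isalpha={ch.isalpha()}")
```

The two non-cased letters are rejected by the four-way test even though they are alphabetic, which is the case a separate isalpha table (or a table of non-cased letters) would have to cover.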