03.10.17 06:29, INADA Naoki wrote:
> Before deferring re.compile, can we make it faster?
>
> I profiled `import string` and small optimization can make it 2x faster!
> (but it's not backward compatible)

Please open an issue for this.

> I found:
>
> * RegexFlag.__and__ and __new__ is called very often.
> * _optimize_charset is slow, because re.UNICODE | re.IGNORECASE
>
> diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
> index 144620c6d1..7c662247d4 100644
> --- a/Lib/sre_compile.py
> +++ b/Lib/sre_compile.py
> @@ -582,7 +582,7 @@ def isstring(obj):
>
>  def _code(p, flags):
>
> -    flags = p.pattern.flags | flags
> +    flags = int(p.pattern.flags) | int(flags)
>      code = []
>
>      # compile info block

Maybe cast flags to int earlier, in sre_compile.compile()?

> diff --git a/Lib/string.py b/Lib/string.py
> index b46e60c38f..fedd92246d 100644
> --- a/Lib/string.py
> +++ b/Lib/string.py
> @@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass):
>      delimiter = '$'
>      idpattern = r'[_a-z][_a-z0-9]*'
>      braceidpattern = None
> -    flags = _re.IGNORECASE
> +    flags = _re.IGNORECASE | _re.ASCII
>
>      def __init__(self, template):
>          self.template = template
>
> patched:
> import time:      1191 |       8479 | string
>
> Of course, this patch is not backward compatible. [a-z] doesn't match
> with 'ı' or 'ſ' anymore.
> But who cares?

This looks like a bug fix. I'm wondering whether it is worth backporting
to 3.6. But the change itself can break user code that changes idpattern
without touching flags. There is another way, but it should be discussed
on the bug tracker.
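
To make the compatibility concern concrete, here is a minimal sketch using only the `re` module. It compares the current flags (`re.IGNORECASE`) with the proposed ones (`re.IGNORECASE | re.ASCII`). The names `OLD_FLAGS`, `NEW_FLAGS`, `matches` and `USER_IDPATTERN` are made up for illustration, and the Cyrillic idpattern is a hypothetical example of a user override, not something taken from the patch.

```python
import re

OLD_FLAGS = re.IGNORECASE             # Template.flags before the patch
NEW_FLAGS = re.IGNORECASE | re.ASCII  # Template.flags as proposed


def matches(pattern, flags, text):
    """Return True if the whole of `text` matches `pattern` under `flags`."""
    return re.fullmatch(pattern, text, flags) is not None


# In Unicode mode, case-insensitive [a-z] also accepts a few non-ASCII
# letters, including the two mentioned in the quoted message.
for ch in ('ı', 'ſ'):
    print(repr(ch), matches(r'[a-z]', OLD_FLAGS, ch),
          matches(r'[a-z]', NEW_FLAGS, ch))

# Hypothetical user override: idpattern extended with a Cyrillic range,
# relying on IGNORECASE to cover the upper-case letters.
USER_IDPATTERN = r'[_a-zа-я][_a-zа-я0-9]*'
print(matches(USER_IDPATTERN, OLD_FLAGS, 'Имя'))  # upper-case first letter
print(matches(USER_IDPATTERN, NEW_FLAGS, 'Имя'))
print(matches(USER_IDPATTERN, NEW_FLAGS, 'имя'))  # all lower-case
```

With `re.ASCII` set, case-insensitive matching only applies to ASCII letters, so a pattern like the hypothetical one above would stop accepting upper-case non-ASCII identifiers even though the pattern itself is unchanged.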