On 9/2/05, Gareth McCaughan <gmccaughan at synaptics-uk.com> wrote:
> On Thursday 2005-09-01 18:09, Guido van Rossum wrote:
>
> > They *are* cached and there is no cost to using the functions instead
> > of the methods unless you have so many regexps in your program that
> > the cache is cleared (the limit is 100).
>
> Sure there is; the cost of looking them up in the cache.
>
> >>> import re,timeit
>
> >>> timeit.re=re
> >>> timeit.Timer("""re.search(r"(\d*).*(\d*)", "abc123def456")""").timeit(1000000)
> 7.6042091846466064
>
> >>> timeit.r = re.compile(r"(\d*).*(\d*)")
> >>> timeit.Timer("""r.search("abc123def456")""").timeit(1000000)
> 2.6358869075775146
>
> >>> timeit.Timer().timeit(1000000)
> 0.091850996017456055
>
> So in this (highly artificial toy) application it's about 7.5/2.5 = 3 times
> faster to use the methods instead of the functions.

Yeah, but the cost is a constant -- it is not related to the cost of
compiling the re. (You should also have shown how much it costs when
the compilation is included in each search.)

I haven't looked into this, but I bet the overhead you're measuring is
actually the extra Python function call, not the cache lookup itself.

I also notice that _compile() is needlessly written as a varargs
function -- all its uses pass it exactly two arguments.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
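
For illustration, a minimal sketch of the missing measurement, assuming
re.purge() (which clears the module-level pattern cache) so that every
re.search() call pays the full compilation cost:

    import re
    import timeit

    # Purge the pattern cache inside the timed statement so each
    # re.search() call must recompile the pattern from scratch.
    t = timeit.Timer(
        r're.purge(); re.search(r"(\d*).*(\d*)", "abc123def456")',
        'import re',
    )
    print(t.timeit(100000))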
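
A rough way to test the function-call hypothesis is to time the bound
method against the same method behind one extra Python-level call, with
no cache lookup involved at all (wrapped_search below is a hypothetical
helper written for this sketch, not anything in the stdlib):

    import re
    import timeit

    pat = re.compile(r"(\d*).*(\d*)")

    def wrapped_search(string, _search=pat.search):
        # one extra Python call frame, roughly mimicking the overhead
        # the module-level re.search() adds on top of the method
        return _search(string)

    setup = 'from __main__ import pat, wrapped_search'
    print(timeit.Timer('pat.search("abc123def456")', setup).timeit(1000000))
    print(timeit.Timer('wrapped_search("abc123def456")', setup).timeit(1000000))

If the gap between those two numbers accounts for most of the original
3x difference, the cache lookup itself is in the noise.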
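
The _compile() cleanup being suggested would look roughly like this (a
sketch against the 2.4-era sre.py; the cache lookup and compilation
bodies are elided):

    # Before: varargs, even though every call site passes exactly two
    # arguments.
    def _compile(*key):
        pattern, flags = key
        pass  # ... cache lookup and actual compilation elided ...

    # After: an explicit two-argument signature; the existing call
    # sites, which all look like _compile(pattern, flags), need no
    # change.
    def _compile(pattern, flags):
        pass  # ... cache lookup and actual compilation elided ...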