Showing content from https://mail.python.org/pipermail/python-dev/2010-July/102276.html below:

[Python-Dev] New regex module for 3.2?

Georg Brandl g.brandl at gmx.net
Fri Jul 23 15:45:52 CEST 2010
On 23.07.2010 11:16, Hrvoje Niksic wrote:
> On 07/22/2010 01:34 PM, Georg Brandl wrote:
>> Timings (seconds to run the test suite):
>>
>> re     26.689  26.015  26.008
>> regex  26.066  25.797  25.865
>>
>> So, I thought there wasn't a difference in performance for this use case
>> (which is compiling a lot of regexes and matching most of them only a
>> few times in comparison).  However, I found that looking at the regex
>> caching is very important in this case: re._MAXCACHE is by default set to
>> 100, and regex._MAXCACHE to 1024.  When I set re._MAXCACHE to 1024 before
>> running the test suite, I get times around 18 (!) seconds for re.
> 
> This seems to point to re being significantly *faster* than regexp, even 
> in matching, and as such may be something the author would want to look 
> into.
> 
> Nick writes:
> 
>  > That still fits with the compile/match performance trade-off changes
>  > between re and regex though.
> 
> The performance trade-off should make regex slower with sufficiently 
> small compiled regex cache, when a lot of time is wasted on compilation. 
>   But as the cache gets larger (and, for fairness, of the same size in 
> both implementations), regex should outperform re.  Georg, would you 
> care to measure if there is a difference in performance with an even 
> larger cache?

I did measure that, and there are no significant differences in timing.

I also did the check the other way around: restricting regex._MAXCACHE
to 100 took the run from 26 seconds to 42 seconds. (Nick, is that enough
data to calculate A and B now? ;)
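
For anyone who wants to try this kind of comparison, here is a minimal
sketch. It is not the test suite timed above: the helper time_patterns,
the pattern list and the sample text are all made up for illustration,
and it only exercises the stdlib re module. It relies on re._MAXCACHE,
a private CPython attribute whose default value and eviction behaviour
differ between versions, so treat the numbers as indicative at best.

```python
import re
import time


def time_patterns(patterns, text, maxcache, repeats=3):
    """Return the best wall-clock time for matching every pattern in
    `patterns` against `text` with re's compile cache capped at `maxcache`.

    Hypothetical helper: it pokes at re._MAXCACHE, which is a private
    implementation detail of CPython's re module.
    """
    old_limit = re._MAXCACHE
    re._MAXCACHE = maxcache
    try:
        best = float("inf")
        for _ in range(repeats):
            re.purge()  # start every repeat with an empty cache
            start = time.perf_counter()
            for _ in range(5):  # several passes so a large cache can pay off
                for pattern in patterns:
                    re.search(pattern, text)
            best = min(best, time.perf_counter() - start)
        return best
    finally:
        # Always restore the original limit and drop the cached patterns.
        re._MAXCACHE = old_limit
        re.purge()


# 300 distinct patterns: more than a cache of 100 can hold, fewer than 1024.
patterns = [r"\bword%d\w*" % i for i in range(300)]
text = "word7 and word299x appear here"

small = time_patterns(patterns, text, maxcache=100)
large = time_patterns(patterns, text, maxcache=1024)
print("cache=100: %.4fs  cache=1024: %.4fs" % (small, large))
```

With more distinct patterns than cache slots, each pass has to recompile
patterns that were already evicted, which is the effect described above.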

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
