On Thu, 2005-11-24 at 14:11 +0000, Duncan Grisby wrote:
> Hi,
>
> I posted this to comp.lang.python, but got no response, so I thought I
> would consult the wise people here...
>
> I have encountered a problem with the re module. I have a
> multi-threaded program that does lots of regular expression searching,
> with some relatively complex regular expressions. Occasionally, events
> can conspire to mean that the re search takes minutes. That's bad
> enough in and of itself, but the real problem is that the re engine
> does not release the interpreter lock while it is running. All the
> other threads are therefore blocked for the entire time it takes to do
> the regular expression search.

I don't know if this will help, but in my experience compiling re's
often takes longer than matching them... are you sure that it's the
match and not a compile that is taking a long time? Are you using
pre-compiled re's or are you dynamically generating strings and using
them?

> Is there any fundamental reason why the re module cannot release the
> interpreter lock, for at least some of the time it is running? The
> ideal situation for me would be if it could do most of its work with
> the lock released, since the software is running on a multi processor
> machine that could productively do other work while the re is being
> processed. Failing that, could it at least periodically release the
> lock to give other threads a chance to run?
>
> A quick look at the code in _sre.c suggests that for most of the time,
> no Python objects are being manipulated, so the interpreter lock could
> be released. Has anyone tried to do that?

probably not... not many people would have several-minutes-to-match
re's.

I suspect it would be do-able... I suggest you put together a patch and
submit it on SF...

-- 
Donovan Baarda <abo at minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/
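
A rough way to check the compile-versus-match question raised above is to
time the two steps separately. What follows is only a minimal sketch under
stated assumptions: it uses modern Python 3 (time.perf_counter and re.purge),
the helper name time_compile_and_search is made up for illustration, and the
pattern and input are hypothetical, picked only because nested quantifiers
backtrack heavily. It is not the regular expression from the program under
discussion.

```python
import re
import time


def time_compile_and_search(pattern, text):
    """Return (compile_seconds, search_seconds) for one pattern/text pair."""
    # Clear re's internal cache so the compile step is really measured
    # instead of being served from a previous compilation.
    re.purge()

    start = time.perf_counter()
    compiled = re.compile(pattern)
    compile_seconds = time.perf_counter() - start

    start = time.perf_counter()
    compiled.search(text)
    search_seconds = time.perf_counter() - start

    return compile_seconds, search_seconds


# Hypothetical example: nested quantifiers with no possible match,
# which forces the engine to backtrack through many alternatives.
pattern = r"(a+)+b"
text = "a" * 18

compile_s, search_s = time_compile_and_search(pattern, text)
print("compile: %.6f s" % compile_s)
print("search:  %.6f s" % search_s)
```

If the search time dominates, pre-compiling the pattern will not help much,
and the time really is being spent inside the matching engine.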