On Mon, 16 Apr 2001 10:59:11 +1200 (NZST), Graham Guttocks wrote:
>Greetings,
>
>I've run into a performance problem in one of my functions, and wonder
>if I could get some recommendations on how to speed things up.
>
>What I'm trying to do is read in a textfile containing e-mail
>addresses, one per line, and use them to build a regular expression
>object in the form "address1|address2|address3|addressN" to search
>against.
>
>I'm using string.join to concatenate the addresses together, separated
>by a `|'. The problem is that string.join is unacceptably slow in
>this task. The following program takes 37 seconds on a PIII/700 to
>process a 239-line file!

Yes, the number one rule on optimization is not to make assumptions
but to profile instead :-)

I didn't do that, but I would bet it's the re.compile being called 239
times that wastes the most time. Just do one re.compile after having
built the regular expression and it should be a lot faster.

Gerhard

>
>--------------------------------------------------------------------
>
>import fileinput, re, string
>list = []
>
>for line in fileinput.input(textfile):
>    # Comment or blank line?
>    if line == '' or line[0] in '#':
>        continue
>    else:
>        list.append(string.strip(line))
>        # "address1|address2|address3|addressN"
>        regex = string.join(list,'|')
>        regex = '"' + regex + '"'
>        reo = re.compile(regex, re.I)
-- 
mail: gerhard <at> bigfoot <dot> de
web: http://highqualdev.com
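
For illustration, here is a rough sketch of what "one re.compile after having built the regular expression" could look like. It is untimed and written for a current Python, so it uses the str.join method rather than string.join. The function name build_address_regex is made up for this example. Two things differ from the quoted program and are my assumptions, not part of the original: each address goes through re.escape so that characters like '.' and '+' match literally, and the surrounding '"' characters are left out, since they would become part of the pattern.

```python
import re


def build_address_regex(lines):
    """Compile one case-insensitive alternation from an iterable of lines.

    Hypothetical helper for illustration; not taken from the quoted program.
    """
    addresses = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith('#'):
            continue
        # Assumption: addresses should match literally, so escape them.
        addresses.append(re.escape(stripped))
    if not addresses:
        # An empty pattern would match every string, so return None instead.
        return None
    # Join and compile exactly once, after the loop has finished.
    return re.compile('|'.join(addresses), re.I)


# Example with in-memory lines; a file object can be passed the same way.
sample = ["# mailing list members\n", "\n",
          "alice@example.com\n", "bob+list@example.org\n"]
reo = build_address_regex(sample)
```

Any iterable of lines works here, including an open file or fileinput.input(textfile). Whether this removes the whole 37 seconds is something only a profile run on the real data can confirm.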