On 3/23/07, Fredrik Lundh <fredrik at pythonware.com> wrote: > Bartlomiej Wolowiec wrote: > > > For some time I'm interested in regular expressions and Finite State Machine. > > Recently, I saw that Python uses "Secret Labs' Regular Expression Engine", > > which very often works too slow. Its pesymistic time complexity is O(2^n), > > although other solutions, with time complexity O(n*m) ( O(n*m^2), m is the > > length of the regular expression and n is the length of the text, > > introduction to problem: http://swtch.com/~rsc/regexp/regexp1.html ) > > that article almost completely ignores all the subtle capturing and left- > to-right semantics that a perl-style engine requires, though. trust me, > this is a much larger can of worms than you might expect. but if you're > ready to open it, feel free to hack away. > > > major part of regular expressions > > the contrived example you used has nothing whatsoever to do with > "major part of regular expressions" as seen in the wild, though. I'd > be much more interested in optimizations that focuses on patterns > you've found in real code. A fruitful direction that is not as ambitious as re-writing the entire engine would be to add "independent group" assertions to python's RE syntax [ (?> ... ) in perl]. They are rather handy for optimizing the malperforming cases alluded to here (which rarely occur as the OP posted, but tend to crop up in less malignant forms). -Mike
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4