There's a new version of the patch http://www.python.org/sf/757624 I understood there was some problems with the first implementation, and changed some aspects of how the new implementation works to fix them. This new implementation is more complex than the other implementations (mine, and the original), given the many details it covers, but is probably faster than these implementations in all situations (tests very welcome). Additionally, it introduces new features like infinite loop protection (*matching* the REs, not preventing their use). Here is a list of valid REs not accepted by the original implementation, and gracefully working on the new implementation (these used to blow up SRE with an infinite loop, not because the implementation was recursive). re.match("(a?)+ab", "ab") re.match("^(a|b?)+a$", "a") re.match("(a*)+?ac", "ab") re.match("((a)*)+?ac", "ab") re.match("(?=a)*", "a") Here is a list of unusual REs I've built while I was understanding how complex the operations could be. All of them work. re.search('(.*?)(?<!-):', 'bc-:') re.match("(.+)+B", "AB") re.match("^(a+){2}b$", "aaaaab").groups() re.match("^(a+)+a$", "aaaaaa").groups() re.match("^((a|b?)a)+a$", "aa").groups() The following RE is a good example of how *not to* build a regular expression. The current implementation hangs for a *long* time while executing it (more than many minutes.. I wasn't able to wait). The new implementation takes as long as 2 seconds. Notice that even the new implementation hangs (exponentially) with larger strings. re.match("(a+?)+?ac", "a"*1000+"b") The following SF issues were fixed, without being specifically addressed (IOW, they just work :-): * 695688 * 753711 * 725149 As a matter of giving credits, I've included part of the lastmark patch by Greg Chapman (712900). Now, I understand that this is complex stuff, thus any kind of help is very welcome. Here is how you can help: * provide benchmarks, good or bad, or provide "ready-to-benchmark" code; * provide information about tests you've done, good or bad, or provide "ready-to-test" code; Also, I don't know about any bugs in the SRE engine which are not fixed in the new implementation. If you have any non-working code or some SF bug I'm not aware of, please let me know. Ahhh.. now I'll relax a little bit. :-) -- Gustavo Niemeyer http://niemeyer.net
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4