Hi Sven,

On 26 January 2017 at 22:13, Sven R. Kunze <srkunze at mail.de> wrote:
> I recently refreshed regular expressions theoretical basics *indulging in
> reminiscences* So, I read https://swtch.com/~rsc/regexp/regexp1.html

Theoretical regular expressions and what Python/Perl/etc. call regular
expressions are a bit different. You can read more about it at
https://en.wikipedia.org/wiki/Regular_expression#Implementations_and_running_times .

Discussions about why they are different often focus on backreferences,
which are a rarely used feature. Let me add two other points.

The theoretical kind of regexp is about giving a "yes/no" answer, whereas
the concrete "re" or "regexp" modules give a match object, which lets you
ask for the subgroups' locations, for example. Strange as it may seem, I am
not aware of a way to do that using the linear-time approach of the theory:
if it answers "yes", then you have no way of knowing *where* the subgroups
matched.

Another issue is that the theoretical engine has no notion of
greedy/non-greedy matching. Basically, you walk over the source characters
and it answers "yes" or "no" after each of them. This is different from a
typical backtracking implementation. In Python:

    >>> re.match(r'a*', 'aaa')
    >>> re.match(r'a*?', 'aaa')

The first call matches three characters and the second matches zero. The
two versions are however indistinguishable for the theoretical engine.

See you soon,

Armin.
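The two points above can be checked with a short sketch that uses only the standard `re` module. The helper name `spans` and the two-group patterns are assumptions chosen for illustration; they do not appear in the message itself. The sketch shows that the greedy and non-greedy patterns accept the same whole string (the "yes/no" answer is identical) while reporting different subgroup locations.

```python
import re

def spans(pattern, text):
    """Return the span of the whole match followed by each subgroup's span.

    Hypothetical helper for illustration; returns None when there is no match.
    """
    m = re.match(pattern, text)
    if m is None:
        return None
    # Group 0 is the whole match; groups 1..n are the subgroups.
    return [m.span(i) for i in range(m.re.groups + 1)]

# Greedy vs. non-greedy on their own: three characters vs. zero.
greedy_len = re.match(r'a*', 'aaa').end()
lazy_len = re.match(r'a*?', 'aaa').end()

# As a pure "yes/no" question on the whole string, both patterns agree.
both_accept = (re.fullmatch(r'a*', 'aaa') is not None
               and re.fullmatch(r'a*?', 'aaa') is not None)

# With subgroups, the overall match is the same but the groups land
# in different places depending on greediness.
greedy_groups = spans(r'(a*)(a*)', 'aaa')
lazy_groups = spans(r'(a*?)(a*)', 'aaa')

print(greedy_len, lazy_len, both_accept)
print(greedy_groups)
print(lazy_groups)
```

In the last two results the first entry (the whole match) is the same, and only the subgroup spans differ, which is the information a plain accept/reject automaton does not carry.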