The iterator returned by re.finditer appears not to terminate if the final match is empty; instead it keeps returning the final (empty) match. Is this a bug in _sre? If so, I'll be happy to file it, though fixing it is a bit beyond my _sre experience level at this point.

The solution would appear to be either to check for a duplicate match in iterator.next(), or to increment the position by one after returning an empty match (which should be OK, because if a non-empty match started at that location, we would have returned it instead of the empty match).

Code to illustrate the failure:

    from re import finditer

    last = None
    for m in finditer(".*", "asdf"):
        if last == m.span():
            print "duplicate match:", last
            break
        print m.group(), m.span()
        last = m.span()

    ---
    asdf (0, 4)
     (4, 4)
    duplicate match: (4, 4)
    ---

findall works:

    import re
    print re.findall(".*", "asdf")
    ['asdf', '']

The workaround is to explicitly check for a duplicate span, as I did above, or to check for a duplicate end(), which avoids the final empty match.

kb
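The second proposed fix (stepping one position forward after an empty match) can be sketched in pure Python. This is only an illustration of the idea, not the _sre implementation: the helper name iter_matches is hypothetical, and it is written for current Python 3 on top of the documented pattern.search(string, pos) call. Note that search with a pos argument is not identical to scanning inside the C iterator (for example, '^' still anchors at the real start of the string), so treat it as a sketch under those assumptions.

```python
import re

def iter_matches(pattern, string):
    """Yield successive matches, stepping past each empty match.

    Hypothetical helper illustrating the "advance by one after an
    empty match" idea; it is not how _sre itself is written.
    """
    compiled = re.compile(pattern)
    pos = 0
    # pos == len(string) is still a valid start: an empty match can occur there.
    while pos <= len(string):
        m = compiled.search(string, pos)
        if m is None:
            break
        yield m
        if m.end() > m.start():
            # Non-empty match: resume scanning where it ended.
            pos = m.end()
        else:
            # Empty match: move forward one position, otherwise the
            # same empty match would be found again forever.
            pos = m.end() + 1

spans = [m.span() for m in iter_matches(".*", "asdf")]
print(spans)  # [(0, 4), (4, 4)]
```

With this approach the final empty match at (4, 4) is returned exactly once and the loop then ends, which agrees with the findall result ['asdf', ''] shown above.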