...

[Tim]
> It's the last line in the loop body that makes empty matches
> a wart if allowed: they wouldn't advance the position at all, and an
> infinite loop would result. In order to make them do what you think you
> want, we'd have to add, at the end of the loop body
>
>     ah, and if the match was empty, advance the position again, by, oh,
>     i don't know, how about 1? That's close to 0 <wink>.

[Andrew Koenig]
> Indeed, that's an arbitrary rule -- just about as arbitrary as the one
> that you abbreviated above, which should really be
>
>     find the next match, but if the match is empty, disregard it;
>     instead, find the next match with a length of at least,
>     oh, I don't know, how about 1? That's close to 0 <wink>.

You really think so? I expect almost all programmers would understand what
"find next non-empty match" means at first glance -- and especially
regexp-slingers, who are often burned in their matching lives by the
consequences of having large pieces of their patterns unexpectedly match an
empty string. That makes "non-empty match" seem a natural concept to me.

> What I'm trying to do is come up with a useful example to convince
> myself that one is better than the other.

Have you found one yet? I confess that re.findall() implements an "if the
match was empty, advance the position by 1" rule, as in

>>> re.findall("x?", "abc")
['', '', '', '']
>>>

But I don't think we're doing anyone a favor with stuff like that. I think
it's a dubious idea that

>>> "abc".find('')
0
>>>

"works" too. If a program does s1.find(s2) and s2 is an empty string, I
expect the chances are good it's a logic error in the program. Analogies to,
e.g., i+j when j happens to be 0 leave me cold, since I can think of a
thousand reasons for why j might naturally be 0. But I've had a hard time
thinking of a reasonable algorithm where the expression s1.find(s2) could be
expected to have s2=="" in normal operation (and am sure it would have been a
logic error elsewhere in any uses of string.find() I've made; ditto searching
for, or splitting on, empty strings via regexps).
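
For anyone who wants to poke at the two rules side by side, here is a rough sketch. The function names findall_advance_on_empty and findall_nonempty are made up for illustration and are not part of the re module. The first spells out the "advance by 1 after an empty match" loop by hand; the second only approximates "find the next non-empty match" by throwing away empty results, which is not the same thing for every pattern.

```python
import re


def findall_advance_on_empty(pattern, text):
    """Hypothetical helper: collect every match, bumping the position by 1
    after an empty match so the loop always makes progress."""
    regex = re.compile(pattern)
    results = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        results.append(m.group())
        if m.end() == m.start():
            # Empty match: without this bump, pos would never change
            # and the loop would run forever.
            pos = m.end() + 1
        else:
            pos = m.end()
    return results


def findall_nonempty(pattern, text):
    """Hypothetical helper: keep only the non-empty matches.

    This filters the results of finditer(); it is an approximation of a
    true "next match of length >= 1" search, not an implementation of it.
    """
    return [m.group() for m in re.finditer(pattern, text) if m.group()]


if __name__ == "__main__":
    print(findall_advance_on_empty("x?", "abc"))
    print(findall_nonempty("x?", "abc"))
    print(findall_nonempty("x?", "axb"))
```

On the "x?" example above, the first function gives the same four empty strings that re.findall() does, while the second returns an empty list.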