Showing content from http://mail.python.org/pipermail/python-list/2001-April/109858.html below:

Regular Expression Question

Kragen Sitaker kragen at dnaco.net
Tue Apr 3 20:03:25 EDT 2001
In article <3aca5b94_2 at news.nwlink.com>,
Wesley Witt <wesw at wittfamily.com> wrote:
>This is probably a simple question, but I can't seem to find the answer
>anywhere.
>
>I want a regular expression that will match ALL lines that do NOT contain
>the string "skip".  I have some backup logs that I need to filter the noise
>out of.

Don't do this.  Your successor maintainers will curse you, your boss
will fire you, and your dog will pee on you.

Just say:

# infile and outfile are file objects that are already open
for line in infile:
    if 'skip' not in line:
        outfile.write(line)

But if you're curious:

You can match a line not containing 's' simply: re.compile("^[^s]*$").

You can match a line not containing 'sk' with more difficulty:
re.compile("^([^s]|s+[^sk])*$")
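
As a quick sanity check of the two patterns so far, here is a small
sketch. The helper name `accepts` and the sample lines are made up for
illustration; only the two REs come from the text above.

```python
import re

no_s = re.compile(r"^[^s]*$")
no_sk = re.compile(r"^([^s]|s+[^sk])*$")

def accepts(pattern, line):
    """Return True if the compiled pattern matches the whole line."""
    return pattern.match(line) is not None

# Print each sample with the verdict of both patterns.
samples = ["backup ok", "tape full", "desk", "assent"]
for line in samples:
    print(repr(line), accepts(no_s, line), accepts(no_sk, line))
```

Note that "assent" contains 's' but not 'sk', so only the second
pattern should accept it.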

'ski' is a little harder; I think there's an easier way to do this, but
I don't know what it is:
re.compile("^([^s]|(s(ks)*)+([^sk]|k[^is]))*$")

(I think there's an easier way because the above RE is not strictly
deterministic: when it sees 'sk', it has to push two states, one for
the 'k' being followed by 's' and one for it being followed by [^is].)

The REs for 'sk' and 'ski' have a bug: if a proper prefix of the evil
sequence ('s', or 'sk' in the 'ski' case) occurs at the end of a line,
they fail to match that line even though it should pass.  I'm not sure
how to fix that, and I don't want to extend it to 'skip'.
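
One possible patch for the end-of-line problem, offered as a sketch
rather than a definitive answer: let each RE finish on a leftover
prefix, that is, a trailing run of s's for 'sk', and a trailing s-run
optionally followed by 'k' for 'ski'. A different route is a negative
lookahead, (?!...), which Python's re module supports and which avoids
the hand construction altogether. All names below (`no_sk_fixed`,
`no_ski_fixed`, `no_skip`, `disagreements`) are invented for this
example, and the brute-force check only covers short strings over a
small alphabet.

```python
import re
from itertools import product

# The patterns from the post, each followed by an optional leftover prefix.
no_sk_fixed = re.compile(r"^([^s]|s+[^sk])*s*$")
no_ski_fixed = re.compile(
    r"^([^s]|(s(ks)*)+([^sk]|k[^is]))*((s(ks)*)+k?)?$"
)

# Alternative: a negative lookahead rejects any line containing 'skip'.
no_skip = re.compile(r"^(?!.*skip)")

def disagreements(pattern, needle, alphabet="skix", max_len=6):
    """List the strings on which the pattern and a plain substring
    test disagree about whether needle is absent."""
    bad = []
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            if (pattern.match(text) is not None) != (needle not in text):
                bad.append(text)
    return bad

print(disagreements(no_sk_fixed, "sk"))
print(disagreements(no_ski_fixed, "ski"))
print(disagreements(no_skip, "skip", alphabet="skipx", max_len=5))
```

Even with a patch like this, the plain substring test in the loop near
the top remains far easier to read and maintain.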
-- 
<kragen at pobox.com>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
Perilous to all of us are the devices of an art deeper than we possess
ourselves.
       -- Gandalf the White [J.R.R. Tolkien, "The Two Towers", Bk 3, Ch. XI]

