Showing content from http://mail.python.org/pipermail/python-dev/2002-August/028381.html below:

Your message to Python-Dev awaits moderator approval

[Python-Dev] FW: Your message to Python-Dev awaits moderator approval
Skip Montanaro skip@pobox.com
Tue, 27 Aug 2002 23:04:41 -0500
    Tim> FYI, here's the closest thing to a real false positive I've seen so
    Tim> far:

I have much smaller spam and ham corpora (currently about 400 msgs each),
but both consist only of messages sent to me in the past couple weeks
(though not all messages sent during that interval), so some of the header
clues which skewed Tim's tests shouldn't be present.  Using my currently
undeleted Python mail as the "unknown" set (none of which is actually
spam), I saw two false positives.  One had an attached GIF image.  The other
was a one-line text+html message whose "words" were thus dominated by the
HTML tags in the second part.

Once my spam and ham grow to something more like 2000 each I will try Tim's
technique of splitting them into smaller chunks, training on one chunk, then
testing against the remaining chunks.
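
A rough sketch of what that chunked train/test loop might look like is
below.  It is not Tim's actual test harness: the names split_into_chunks,
evaluate_by_chunks, train and is_spam are hypothetical, and the classifier
is passed in as two callables so that nothing is assumed about how the real
scoring works.

```python
import random


def split_into_chunks(messages, n_chunks, seed=0):
    """Shuffle a copy of messages and deal it into n_chunks similar-sized lists."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_chunks] for i in range(n_chunks)]


def evaluate_by_chunks(ham, spam, n_chunks, train, is_spam):
    """Train on one chunk at a time and test against all remaining chunks.

    train(ham_msgs, spam_msgs) -> model and is_spam(model, msg) -> bool are
    hypothetical callables supplied by the caller.
    Returns a list of (training chunk index, false positives, false negatives).
    """
    ham_chunks = split_into_chunks(ham, n_chunks)
    spam_chunks = split_into_chunks(spam, n_chunks)
    results = []
    for i in range(n_chunks):
        model = train(ham_chunks[i], spam_chunks[i])
        false_pos = false_neg = 0
        for j in range(n_chunks):
            if j == i:
                continue  # never test on the chunk used for training
            false_pos += sum(1 for m in ham_chunks[j] if is_spam(model, m))
            false_neg += sum(1 for m in spam_chunks[j] if not is_spam(model, m))
        results.append((i, false_pos, false_neg))
    return results


# Toy stand-in classifier, for illustration only: a message is "spam" if it
# contains any word seen in the spam training chunk but not the ham one.
def train_toy(ham_msgs, spam_msgs):
    ham_words = {w for m in ham_msgs for w in m.split()}
    spam_words = {w for m in spam_msgs for w in m.split()}
    return spam_words - ham_words


def is_spam_toy(model, msg):
    return any(w in model for w in msg.split())
```

With 2000 messages in each corpus and, say, five chunks, each run would
train on 400 ham and 400 spam and test against the other 1600 of each, so
every message gets used as unseen test data in four of the five runs.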

Skip


