| As far as it goes, yes.

How would it learn?

I have some ideas about how you could hook this into Mailman to do
community/membership-assisted learning. Understanding that people
will be highly motivated to inform you about spam but not about good
messages, you essentially queue a copy of a random sampling of
messages for a few days. Members can let the list admin know about
leaked spam (via a URL or -spam address, or whatever), and after the
list admin verifies it, this trains the system on that spam. If no
feedback on a message arrives after a few days, you train the system
on that message as known good. You need list admin verification to
avoid attack vectors (I get mad at Guido so I -- a normal user --
label all his messages as spam).

| On a more mundane note, I'd like to see decoding of base64 in it.
|
| (Oh, and on a blue-sky note, has anyone taken up Graham's suggestion
| of having one of these things that looks at word pairs instead of
| words?)
|
| It's neat that ESR saw immediately that the daemon should be
| self-contained, no access to home directories. SpamAssassin doesn't
| have a simple way of doing that, and [ISP] is modifying it to have
| one -- and you wouldn't believe the resistance to the proposed
| changes from some of the SA developers. Some of them really seem
| to think that it's better and simpler to store user configuration
| in a database than to have the client send its config file to the
| server along with each message.

>>>>> "ZW" == Zack Weinberg <zack@codesourcery.com> writes:

    ZW> I remember you said you didn't want to do base64 decode
    ZW> because it was too slow?

But there might be some interesting, integrated ways around that.
Say, for example, you take a Python-enabled mail server, parse the
message into its decoded form early (but not before low-level
SMTP-based rejections), and then pass that parsed and decoded message
object tree around to all the other subsystems that are interested,
e.g. the Bayes filter and Mailman. That way you pay the cost of
parsing and decoding once, and amortize it over the rest of the
lifetime of that message on your system. I think we have all the
pieces in place to play with this approach on python.org.

-Barry
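
A rough sketch of the feedback loop described above, just to make the
moving parts concrete. Everything here is hypothetical: the
FeedbackQueue class, its method names, the three-day window, and the
train(text, is_spam) callback are assumptions for illustration, not
Mailman or filter APIs. It also assumes that a report the admin
rejects simply goes back to aging out as good mail, which the
proposal does not spell out.

```python
import random
from dataclasses import dataclass

HOLD_SECONDS = 3 * 24 * 3600  # "a few days"; the exact window is an assumption


@dataclass
class HeldMessage:
    text: str
    queued_at: float
    reported: bool = False  # True once a member has flagged it as spam


class FeedbackQueue:
    """Hypothetical holding queue for a random sample of list traffic."""

    def __init__(self, train, sample_rate=0.1, rng=None):
        self.train = train              # callback: train(text, is_spam)
        self.sample_rate = sample_rate  # fraction of messages to hold
        self.rng = rng or random.Random()
        self.held = {}                  # msgid -> HeldMessage

    def maybe_hold(self, msgid, text, now):
        """Keep a copy of a random sampling of messages."""
        if self.rng.random() < self.sample_rate:
            self.held[msgid] = HeldMessage(text, now)
            return True
        return False

    def report_spam(self, msgid):
        """A member report only flags the message; it trains nothing."""
        msg = self.held.get(msgid)
        if msg is None:
            return False
        msg.reported = True
        return True

    def admin_verdict(self, msgid, is_spam):
        """Only the list admin's confirmation trains the filter on spam."""
        msg = self.held.get(msgid)
        if msg is None or not msg.reported:
            return False
        if is_spam:
            del self.held[msgid]
            self.train(msg.text, True)
        else:
            msg.reported = False  # assumption: ages out as good mail later
        return True

    def expire(self, now):
        """Train on unreported messages as known good after the window."""
        for msgid, msg in list(self.held.items()):
            if not msg.reported and now - msg.queued_at >= HOLD_SECONDS:
                del self.held[msgid]
                self.train(msg.text, False)
```

The point of routing everything through admin_verdict() is the attack
vector mentioned above: a single annoyed member can flag as many
messages as they like without affecting what the filter learns.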
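
For the parse-once idea, here is a minimal sketch using the standard
library's email package. The dispatch() function and the
handler(msg, texts) calling convention are made up for illustration;
a real mail server would have its own plumbing. The only point is
that the base64 (or quoted-printable) decoding happens a single time
and every interested subsystem gets the same decoded result.

```python
import base64
from email import message_from_bytes, policy


def parse_once(raw):
    """Parse the raw bytes into a message object tree, one time only."""
    return message_from_bytes(raw, policy=policy.default)


def decoded_text_parts(msg):
    """Yield each text part as str, with transfer encoding undone."""
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            yield part.get_content()


def dispatch(raw, subsystems):
    """Decode once, then hand the same objects to every subsystem.

    Each entry in `subsystems` is a hypothetical callable taking
    (msg, texts), e.g. a Bayes filter hook or a Mailman hook.
    """
    msg = parse_once(raw)
    texts = list(decoded_text_parts(msg))
    return [handler(msg, texts) for handler in subsystems]


# A sample message whose body is base64-encoded.
SAMPLE = (
    b"From: someone@example.com\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n" + base64.b64encode(b"cheap pills now\n") + b"\r\n"
)
```

Calling dispatch(SAMPLE, [bayes_hook, mailman_hook]) with two such
callables would give both of them the already-decoded body text, so
neither has to repeat the base64 work.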