>>> "Brad Clements" wrote
> This is one way to do it, but I was planning on experimenting with tokenizer methods
> that strip out HTML tags, leaving only the text.

With the set I'm working with, I found I needed to strip out everything
except the src="" and href="" attributes of tags. There's too much goodness
in them for the system to get its teeth into.

> Tells me (spammer hat on) that I can send message with a non-spammish text
> only part, and a spam html part since most "non-techie" email client users
> automatically display the html version when available, however Tim's
> implementation will ignore it.

I've actually got a bunch of spam like that. The text/plain part is
something like

**This is a HTML message**

and nothing else.

Anthony
-- 
Anthony Baxter <anthony@interlink.com.au>
It's never too late to have a happy childhood.
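
The stripping described in the message above (drop the markup, keep the text plus the values of src and href attributes) could be sketched roughly as follows. This is only an illustration using the standard library's html.parser module, not the tokenizer under discussion; the names LinkKeepingStripper and strip_html_keep_links are hypothetical, and emitting each kept attribute as a "name=value" token is an assumption about what would be useful to a classifier.

```python
from html.parser import HTMLParser


class LinkKeepingStripper(HTMLParser):
    """Hypothetical helper: discard tags, keep text and src/href values."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lowercases names.
        # Self-closing tags such as <img ... /> also end up here by default.
        for name, value in attrs:
            if name in ("src", "href") and value:
                # Assumption: keep the attribute as a single "name=value" token.
                self.pieces.append(f"{name}={value}")

    def handle_data(self, data):
        # Keep visible text, dropping whitespace-only runs between tags.
        text = data.strip()
        if text:
            self.pieces.append(text)


def strip_html_keep_links(html):
    """Return the text of an HTML string plus its src/href attribute values."""
    parser = LinkKeepingStripper()
    parser.feed(html)
    parser.close()
    return " ".join(parser.pieces)
```

Feeding the result to a whitespace-splitting tokenizer would then yield the visible words alongside the link and image targets. The sketch makes no attempt to skip script or style bodies, and it says nothing about which MIME part to read; both would need handling in real use.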