    Guido> I haven't given up on the re module for fast scanners (see Tim's
    Guido> note on the speed of tokenizing 20,000 messages in mere minutes).
    Guido> Note that the Bayes approach doesn't *need* a trick to apply many
    Guido> regexes in parallel to the text.

Right.  I'm thinking of it in situations where you do need such tricks.
SpamAssassin is one such place.  I think Eric has an application (quickly
tokenizing the data produced by an external program, where the data can run
into several hundreds of thousands of lines) where this might be beneficial
as well.

Skip
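
For readers unfamiliar with the "apply many regexes in parallel" trick mentioned above, here is a minimal sketch of one common way to approximate it with the re module: join the patterns into a single alternation of named groups and let one compiled pattern do the scanning. The token names and patterns (TOKEN_SPEC, MASTER, tokenize) are hypothetical and chosen only for illustration; this is not what SpamAssassin or Eric's application actually does.

```python
import re

# Hypothetical token classes, for illustration only.
# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("WORD", r"[A-Za-z_]\w*"),
    ("SPACE", r"\s+"),
    ("OTHER", r"."),
]

# One master pattern: each sub-pattern becomes a named group.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Yield (kind, value) pairs, skipping whitespace."""
    for match in MASTER.finditer(text):
        kind = match.lastgroup  # name of the alternative that matched
        if kind != "SPACE":
            yield kind, match.group()


if __name__ == "__main__":
    for token in tokenize("buy 3.5 pills now!"):
        print(token)
```

The alternation is still tried left to right at each position, so this saves the overhead of looping over separate patterns in Python rather than giving true parallel matching; whether it is fast enough for a given workload would need to be measured.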