On Nov 3, 2009, at 12:06 AM, Guido van Rossum wrote:
> Though I imagine what
> it really needs is a "quirks mode" parser that is compatible with the
> HTML dialect accepted by, say, IE6. Maybe a summer of code project?

Already exists: html5lib.
http://code.google.com/p/html5lib/

Or if you want a faster (yet I think less exact) HTML parser, libxml2's HTML parser, via lxml:
http://codespeak.net/lxml/parsing.html#parsing-html

James
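
A minimal sketch of how the two suggestions above might be tried on malformed markup. It assumes html5lib and lxml may or may not be installed, so each import is guarded and the standard library's html.parser serves as a last resort. The sample markup and the names BROKEN, TextCollector, and extract_text are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample of "tag soup": unclosed <p> elements and misnested <b>/<i>.
BROKEN = "<p>Unclosed paragraph<p>Second <b>bold <i>mixed</b> nesting</i>"


class TextCollector(HTMLParser):
    """Standard-library fallback: gathers text nodes without building a tree."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(markup):
    """Return (parser_name, text) from the first parser that can be imported."""
    try:
        import html5lib  # third-party; assumed optional here

        # With the default tree builder, parse() returns an ElementTree element.
        root = html5lib.parse(markup)
        return "html5lib", "".join(root.itertext())
    except ImportError:
        pass

    try:
        import lxml.html  # third-party; wraps libxml2's HTML parser

        return "lxml", lxml.html.fromstring(markup).text_content()
    except ImportError:
        pass

    # Neither library is available: fall back to the standard library.
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "html.parser", "".join(collector.chunks)


if __name__ == "__main__":
    name, text = extract_text(BROKEN)
    print(name, repr(text))
```

Whichever parser is picked, the text should come through despite the broken nesting; the parsers differ mainly in the tree they build around it, which this sketch does not inspect.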