Showing content from https://mail.python.org/pipermail/python-dev/2011-July/112659.html below:

[Python-Dev] HTMLParser and HTML5

Glyph Lefkowitz glyph at twistedmatrix.com
Fri Jul 29 22:16:07 CEST 2011
On Jul 29, 2011, at 3:00 PM, Matt wrote:

> I don't see any real reason to drop a decent piece of code (HTMLParser, that is) in favor of a third party library when only relatively minor updates are needed to bring it up to speed with the latest spec.

I am not really one to throw stones here, as Twisted contains a lenient pseudo-XML parser which I still maintain - one which decidedly does not follow HTML5's requirements for dealing with invalid data, and instead relies on a bunch of ad-hoc guesses of my own.

My impression of HTML5 is that HTMLParser would require significant modifications and possibly a drastic re-architecture in order to really do HTML5 "right", especially for the parts that the html5lib authors claim make HTML5 streaming-unfriendly, i.e. subtree reordering when encountering certain types of invalid data.

But if I'm wrong about that, and there are just a few spec updates and bugfixes that need to be applied, by all means, ignore my comment.
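
To make the streaming concern concrete, here is a minimal sketch, assuming Python 3's html.parser module (the renamed successor of the HTMLParser module under discussion). EventRecorder and record_events are hypothetical names used only for illustration. The input puts a <p> directly inside a <table>, which is invalid; as far as I understand the HTML5 tree-construction rules, a conforming parser would move that <p> in front of the table ("foster parenting"), whereas an event-based parser has already reported the <table> start tag by the time it sees the stray element.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: record parser events in the order they arrive."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Ignore whitespace-only text so the event list stays readable.
        if data.strip():
            self.events.append(("data", data.strip()))


def record_events(markup):
    """Feed markup to the streaming parser and return the recorded events."""
    parser = EventRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


# Invalid markup: a <p> placed directly inside <table>.
events = record_events("<table><p>stray</p><tr><td>cell</td></tr></table>")
for event in events:
    print(event)
```

The events come out strictly in source order, with the <p> reported after the <table> start tag. Producing the HTML5-mandated tree from such a stream would mean buffering or revising output that has already been emitted, which is the kind of re-architecture I have in mind.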

-glyph

