Showing content from https://mail.python.org/pipermail/python-dev/2003-October/039629.html below:

[Python-Dev] htmllib vs. HTMLParser

Guido van Rossum guido at python.org
Mon Oct 27 11:52:53 EST 2003
> Over in the Web SIG, it was noted that the HTML parser in htmllib has
> handlers for HTML 2.0 elements, and it should really support HTML 4.01, the
> current version.  I'm looking into doing this.
> 
> We actually have two HTML parsers: htmllib.py and the more recent
> HTMLParser.py.  The initial check-in comment for 2001/05/18 for
> HTMLParser.py reads:
> 
>       A much improved HTML parser -- a replacement for sgmllib.  The API is
>       derived from but not quite compatible with that of sgmllib, so it's a
>       new file.  I suppose it needs documentation, and htmllib needs to be
>       changed to use this instead of sgmllib, and sgmllib needs to be
>       declared obsolete.  But that can all be done later.
> 
> sgmllib only handles those bits of SGML needed for HTML, and anyone doing
> serious SGML work is going to have to use a real SGML parser, so deprecating 
> sgmllib is reasonable.  HTMLParser needs no changes for HTML 4.01; only
> htmllib needs to get a bunch more handler methods.
> 
> Should I try to do this for 2.4?

I'm unclear on what you plan to do -- repeal sgmllib and rewrite
htmllib to use HTMLParser internally for a backwards-compatible
interface?
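
To make the question concrete, here is a rough sketch of what such a
compatibility layer might look like. It assumes the modern Python 3
module location (html.parser rather than the 2003-era HTMLParser.py),
and the class names SgmlStyleParser and TitleCollector are hypothetical,
invented for illustration only. The idea is to reproduce the
sgmllib-style dispatch to start_<tag>, end_<tag> and do_<tag> methods on
top of HTMLParser's generic handle_starttag/handle_endtag callbacks; it
is not the actual htmllib implementation.

```python
from html.parser import HTMLParser


class SgmlStyleParser(HTMLParser):
    """Hypothetical adapter: sgmllib-style per-tag dispatch on HTMLParser."""

    def handle_starttag(self, tag, attrs):
        # Prefer start_<tag>; fall back to do_<tag> for empty elements.
        method = (getattr(self, "start_" + tag, None)
                  or getattr(self, "do_" + tag, None))
        if method is not None:
            method(attrs)
        else:
            self.unknown_starttag(tag, attrs)

    def handle_endtag(self, tag):
        method = getattr(self, "end_" + tag, None)
        if method is not None:
            method()
        else:
            self.unknown_endtag(tag)

    # Fallback hooks, named after their sgmllib counterparts.
    def unknown_starttag(self, tag, attrs):
        pass

    def unknown_endtag(self, tag):
        pass


class TitleCollector(SgmlStyleParser):
    """Example subclass written against the old per-tag handler style."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.breaks = 0

    def start_title(self, attrs):
        self.in_title = True

    def end_title(self):
        self.in_title = False

    def do_br(self, attrs):
        self.breaks += 1

    def handle_data(self, data):
        # Only collect text that appears inside <title>...</title>.
        if self.in_title:
            self.title += data


parser = TitleCollector()
parser.feed("<html><head><title>Hello</title></head>"
            "<body>a<br>b<br>c</body></html>")
parser.close()
print(parser.title, parser.breaks)
```

With a layer like this, existing subclasses that define per-tag handler
methods would keep working, and supporting HTML 4.01 would mostly mean
adding handler methods for the newer elements.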

> (I can't find an explanation of how the API differs between the two modules
> but can figure it out by inspecting the code, and will try to keep the
> htmllib module backward-compatible.)

That would be required for a few releases, yes.

I'm okay with deprecating sgmllib faster than htmllib.

--Guido van Rossum (home page: http://www.python.org/~guido/)
