Showing content from https://mail.python.org/pipermail/python-dev/2001-October/017690.html below:

[Python-Dev] is htmllib broken in 2.2a4?

Skip Montanaro (skip@pobox.com)
Mon, 1 Oct 2001 21:52:58 -0500
Responding to a question in python-help about extracting links from web
pages, I wrote a simple href printer:

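    import htmllib, formatter

    class MyParser(htmllib.HTMLParser):
        # Called by HTMLParser.start_a() for each <a> tag.
        def anchor_bgn(self, href, name, type):
            print href

    # NullFormatter discards all formatted output; only the hrefs are wanted.
    fmt = formatter.NullFormatter()
    parser = MyParser(fmt, verbose=1)
    parser.feed(open("tour01.html").read())
    parser.close()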

When run using 2.2a4, it never prints anything.  It outputs a list of hrefs
when run with 2.1 or 1.6.  Either there's a bug somewhere (in my code
possibly, though it's pretty simple) or some semantics changed that I
missed.
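
For comparison, here is a rough sketch of the same href printer for a current
Python 3, where htmllib and formatter no longer exist. This is an assumption
about how one might port the idea, not part of the original report: the names
HrefCollector and extract_hrefs are made up for illustration, and the sketch
uses html.parser.HTMLParser with handle_starttag in place of anchor_bgn.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser (illustrative name)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lowercases
        # tag and attribute names before calling this method.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_hrefs(html_text):
    """Parse html_text and return the hrefs in document order (hypothetical helper)."""
    parser = HrefCollector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


if __name__ == "__main__":
    sample = ('<p><a href="a.html">A</a> <a name="x">no link</a> '
              '<a HREF="b.html">B</a></p>')
    for href in extract_hrefs(sample):
        print(href)
```

Collecting the hrefs in a list rather than printing them inside the handler
makes it easy to check that the callback is actually being invoked, which is
the question at issue here.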

I thought maybe the method resolution order change affected things, but
htmllib.HTMLParser only uses single inheritance.  When displaying help about
htmllib.HTMLParser, pydoc.help does emit the method resolution order, which
it doesn't generally seem to do:

    class HTMLParser(sgmllib.SGMLParser)
     |  Method resolution order:
     |      HTMLParser
     |      sgmllib.SGMLParser
     |      markupbase.ParserBase
     ...
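
One way to check whether method resolution is the culprit is to ask which
class in the resolution order actually supplies a given method. The sketch
below is a minimal illustration for a current Python 3, using stand-in classes
(Base, Middle, and this MyParser are hypothetical, not the htmllib hierarchy)
and a made-up helper, defining_class, built on inspect.getmro.

```python
import inspect


class Base:
    # Stand-in for the library class that defines the hook.
    def anchor_bgn(self, href, name, type):
        return "base"


class Middle(Base):
    # Stand-in for an intermediate class that does not override the hook.
    pass


class MyParser(Middle):
    # Stand-in for the user subclass that overrides the hook.
    def anchor_bgn(self, href, name, type):
        return "override"


def defining_class(cls, method_name):
    """Return the first class in cls's MRO whose own namespace defines method_name."""
    for klass in inspect.getmro(cls):
        if method_name in vars(klass):
            return klass
    return None


if __name__ == "__main__":
    print([c.__name__ for c in inspect.getmro(MyParser)])
    print(defining_class(MyParser, "anchor_bgn").__name__)
```

If the override is the first definition found, the resolution order is not
what prevents it from running, and the cause is more likely that the base
class never calls the hook at all.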

Skip



