Showing content from http://mail.python.org/pipermail/python-dev/2003-February/033248.html below:

[Python-Dev] Grzegorz Adam Hankiewicz found a parsing bug in HTMLParser.

Guido van Rossum guido@python.org
Sun, 09 Feb 2003 20:12:47 -0500
> MSG-ID of the origin: <mailman.1044810540.18789.python-list@python.org>

Alas, that's not helpful in tracking down the original message.

> A bit of investigation showed that the bug exists because of that line:
> 
>         <a href="http://ss"title="pe">P</a>
>                          ^^^

Which is blatantly invalid HTML, of course.

> The code responsible for that complaint is the method
> check_for_whole_start_tag() of class HTMLParser, lines 308 to 312:
> 
>             if next in ("abcdefghijklmnopqrstuvwxyz=/"
>                         "ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
>                 # end of input in or before attribute value, or we have the
>                 # '/' from a '/>' ending
>                 return -1
> 
> I don't want to change this, since I'm sure I'd make HTMLParser
> weak for some other conditions. Is there anybody who knows the code
> for HTMLParser.py?
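
For anyone wanting to reproduce the report, here is a minimal sketch of feeding the quoted line to the parser. It assumes Python 3, where the class lives in html.parser rather than in the HTMLParser.py discussed here. StartTagCollector and collect_start_tags are hypothetical names made up for this illustration, not part of the library. Later versions of the parser are more tolerant than the 2003 code, so the result may well differ from what the original poster saw.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Hypothetical helper: records each start tag with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs as the parser saw them.
        self.seen.append((tag, attrs))


def collect_start_tags(markup):
    """Feed markup to a fresh parser and return the start tags it reported."""
    parser = StartTagCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


if __name__ == "__main__":
    # The line from the report: no whitespace between the two attributes.
    print(collect_start_tags('<a href="http://ss"title="pe">P</a>'))
```

Running the same input with and without a space before title= is a quick way to confirm that the missing whitespace is what triggers the difference.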

This isn't really the right forum for this, but I hope you can post
either a bug report or a patch to sourceforge.  If you need someone to
help investigate first, the right place to ask is comp.lang.python.

--Guido van Rossum (home page: http://www.python.org/~guido/)


