Tue Nov 3 13:41:04 CET 2009 · https://mail.python.org/pipermail/python-dev/2009-November/093596.html

On Mon, 2 Nov 2009 at 22:06, Guido van Rossum wrote:
> On Mon, Nov 2, 2009 at 9:51 PM, ssteinerX at gmail.com <ssteinerx at gmail.com> wrote:
>> BeautifulSoup, which I use every day, is one such product.  Since the crappy
>> old SGML parser's gone, BeautifulSoup uses the one that's left in Python 3
>> and it makes BeautifulSoup completely useless for my daily work.
>
> This sounds like an area where some help might be useful. Perhaps the
> quickest solution would simply be to copy the old crappy "sgml" based
> html parser into a new version of BeautifulSoup. Though I imagine what
> it really needs is a "quirks mode" parser that is compatible with the
> HTML dialect accepted by, say, IE6. Maybe a summer of code project?

It's not a matter of quirks.  It's a matter of being able to parse
truly broken html/xml, which browsers unfortunately do too well
for everyone else's sanity.

So, call it a "sloppy mode" parser, and then yes, that would solve the
problem.

--David (RDM)


[Python-Dev] 2.7 Release? 2.7 == last of the 2.x line?