The elementtree treewalker seems to miss out the end tag for the root element:
    In [565]: t = html5lib.HTMLParser(tree=html5lib.treebuilders.getTreeBuilder("etree", ElementTree)).parse("<html></html>")

    In [566]: twe = treewalkers.getTreeWalker("etree", ElementTree)

    In [567]: for item in twe(t):
       .....:     print item
       .....:
       .....:
    {'data': [], 'type': 'StartTag', 'name': u'html'}
    {'data': [], 'type': 'StartTag', 'name': u'head'}
    {'data': [], 'type': 'EndTag', 'name': u'head'}
    {'data': [], 'type': 'StartTag', 'name': u'body'}
    {'data': [], 'type': 'EndTag', 'name': u'body'}

(c.f. correct behavior for simpletree:

    In [569]: t = html5lib.HTMLParser(tree=html5lib.treebuilders.getTreeBuilder("simpletree")).parse("<html></html>")

    In [570]: tw = treewalkers.getTreeWalker("simpletree")

    In [571]: for item in tw(t): print item
       .....:
       .....:
    {'data': [], 'type': 'StartTag', 'name': u'html'}
    {'data': [], 'type': 'StartTag', 'name': u'head'}
    {'data': [], 'type': 'EndTag', 'name': u'head'}
    {'data': [], 'type': 'StartTag', 'name': u'body'}
    {'data': [], 'type': 'EndTag', 'name': u'body'}
    {'data': [], 'type': 'EndTag', 'name': u'html'}
)

-- 
"Mixed up signals
Bullet train
People snuffed out in the brutal rain"
--Conner Oberst
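A quick way to confirm the omission without eyeballing the output is to check that start and end tags balance. This is only a minimal sketch: `unbalanced_tags` is a hypothetical helper, not part of html5lib, and the token lists are copied by hand from the two sessions above rather than produced by a treewalker. It is written for Python 3, hence `print()` and plain strings.

```python
def unbalanced_tags(tokens):
    """Return the names of elements whose StartTag never got a matching EndTag.

    Hypothetical helper, not part of html5lib. It only looks at the
    'type' and 'name' keys of each token dict.
    """
    stack = []
    for token in tokens:
        if token["type"] == "StartTag":
            stack.append(token["name"])
        elif token["type"] == "EndTag":
            # Close the innermost open element if the names agree.
            if stack and stack[-1] == token["name"]:
                stack.pop()
    return stack


# Token stream reported for the etree treewalker (copied from the session).
etree_tokens = [
    {"data": [], "type": "StartTag", "name": "html"},
    {"data": [], "type": "StartTag", "name": "head"},
    {"data": [], "type": "EndTag", "name": "head"},
    {"data": [], "type": "StartTag", "name": "body"},
    {"data": [], "type": "EndTag", "name": "body"},
]

# The simpletree treewalker additionally emits the closing root tag.
simpletree_tokens = etree_tokens + [
    {"data": [], "type": "EndTag", "name": "html"},
]

print(unbalanced_tags(etree_tokens))       # ['html']
print(unbalanced_tags(simpletree_tokens))  # []
```

In a real check the hand-written lists would be replaced by the output of the treewalker under test, e.g. `list(twe(t))`.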