Hi all! I'm coding a small blogging system (for my own use) and I'd like to use html5lib to parse entries (before feeding them to a Genshi template for output). Future uses will include comment parsing and probably pingback/trackback, but I'm not there yet :-P
Entries are "HTML fragments" (much like those you can put in an Atom Text Construct with type="html"), so I'd like to parse them as "innerHTML".

Given Ian's proposal in <http://code.google.com/p/html5lib/issues/detail?id=18> and the algorithm for setting innerHTML in HTML documents <http://www.whatwg.org/specs/web-apps/current-work/#innerhtml0>, I'd like to suggest the following design change before the "innerHTML case" is implemented:

Instead of having the innerHTML argument to the HTMLParser.parse method be a boolean, could we make it a string naming the element whose innerHTML property we're setting? If it evaluates to False (i.e. it is False, None or an empty string), we consider that we're not in the "innerHTML case" (so the current asserts in the code shouldn't need to change).

Ian's suggestion in issue #18 would read:

    myFragment = parser.parse(s, innerHTML='div')

The problem might come when returning the parsed fragment: TreeBuilder.getDocument is not suited to that. So maybe it's worth dropping the innerHTML argument from parse() and adding a parseFragment(container, stream, encoding=None) instead, along with a getFragment() on TreeBuilder that returns either the <html> element or a DocumentFragment (which already exists in xml.dom.minidom even if it's not documented, and is easy to add to ElementTree, not to mention simpletree ;-) )

Any input?
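To make the parseFragment()/getFragment() split a bit more concrete, here is a rough sketch of the shape I have in mind. Everything in it is an assumption for illustration only: ToyParser is a hypothetical stand-in, not html5lib's HTMLParser, and it leans on xml.dom.minidom's XML parser, so it only accepts well-formed markup and does not implement the innerHTML algorithm from the spec. The only point is to show a separate fragment entry point that takes the container name and returns a DocumentFragment.

```python
from xml.dom import minidom


class ToyParser:
    """Hypothetical stand-in for the proposed API; handles well-formed XML only."""

    def parse(self, stream):
        # Whole-document case: no innerHTML argument, returns a Document.
        return minidom.parseString(stream)

    def parseFragment(self, container, stream):
        # Fragment case: `container` names the element whose innerHTML
        # is being set. A falsy name is rejected here, since the
        # non-fragment case already has parse().
        if not container:
            raise ValueError("container must be a non-empty element name")
        # Assumption: wrapping the input in the container element is enough
        # for this toy; a real HTML parser would set its insertion mode instead.
        wrapped = "<%s>%s</%s>" % (container, stream, container)
        return self.getFragment(minidom.parseString(wrapped))

    def getFragment(self, doc):
        # Move the container's children into a DocumentFragment.
        fragment = doc.createDocumentFragment()
        root = doc.documentElement
        while root.firstChild is not None:
            # appendChild detaches the node from its old parent first.
            fragment.appendChild(root.firstChild)
        return fragment


parser = ToyParser()
myFragment = parser.parseFragment("div", "<p>Hello <em>world</em></p><p>Bye</p>")
print([node.tagName for node in myFragment.childNodes])
```

With this split, parse() keeps its current signature, and callers who want a fragment say so explicitly instead of passing a flag whose type decides the return value.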