*For example* (missing quotes):

1. <a href="http://www.somewebsite.com> some link </a>
2. <img src=">

It only happens when the input has nothing valid. If I replace example 2 with either of these, the error does not occur:

1. <div> <img src="> </div>
2. <div> </div> <img src=">

(It can be a div or anything else, as long as the malformed tag is not the only element.)

*Code fragment*:

    import html5lib
    from html5lib import sanitizer, serializer, treebuilders, treewalkers

    bad_html = '<img src=">'  # e.g. example 2 above

    # Parse the fragment into a DOM tree, sanitizing at the tokenizer stage.
    parser = html5lib.HTMLParser(tree=treebuilders.getTreeBuilder('dom'),
                                 tokenizer=sanitizer.HTMLSanitizer)
    sometree = parser.parseFragment(bad_html)

    # Walk the tree and serialize it back to HTML.
    walker = treewalkers.getTreeWalker('dom')
    stream = walker(sometree)
    s = serializer.htmlserializer.HTMLSerializer(quote_attr_values=True)
    nice_html = s.render(stream)  # <---- it fails here

*The question*: I would like to know whether this is the expected behavior or whether I am doing something wrong.

*Additional info*: I'm using the lib to sanitize user input.

*Output*:

    File "somemodule.py", line 20, in somefunction
      nice_html = s.render(stream)
    File "some_env/local/lib/python2.7/site-packages/html5lib-0.95-py2.7.egg/html5lib/serializer/htmlserializer.py", line 302, in render
      return u"".join(list(self.serialize(treewalker)))
    File "some_env/local/lib/python2.7/site-packages/html5lib-0.95-py2.7.egg/html5lib/serializer/htmlserializer.py", line 192, in serialize
      for token in treewalker:
    File "some_env/local/lib/python2.7/site-packages/html5lib-0.95-py2.7.egg/html5lib/filters/optionaltags.py", line 15, in __iter__
      type = token["type"]
    TypeError: 'NoneType' object has no attribute '__getitem__'
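The traceback ends at `token["type"]` with `token` being `None`, and the failure only shows up when nothing valid survives in the input. One possible explanation, offered as an assumption rather than a confirmed reading of html5lib's source, is that the filter walks the token stream with a look-behind/look-ahead window padded with `None`, so an empty stream still produces one all-`None` entry. The sketch below reproduces that pattern with plain Python only; `slider`, `token_types`, and `safe_token_types` are hypothetical stand-ins, not html5lib functions.

```python
def slider(source):
    """Yield (previous, current, next) triples, padding both ends with None."""
    previous1 = previous2 = None
    for token in source:
        if previous1 is not None:
            yield previous2, previous1, token
        previous2 = previous1
        previous1 = token
    # Final triple; for an empty source this is (None, None, None).
    yield previous2, previous1, None


def token_types(source):
    """Hypothetical filter body: reads token["type"] on every triple."""
    for _previous, token, _next in slider(source):
        yield token["type"]  # raises TypeError when token is None


def safe_token_types(source):
    """Same loop, but skips the padding triple an empty source produces."""
    for _previous, token, _next in slider(source):
        if token is None:
            continue
        yield token["type"]


if __name__ == "__main__":
    tokens = [{"type": "StartTag"}, {"type": "EndTag"}]
    print(list(token_types(tokens)))   # non-empty stream: no error
    print(list(safe_token_types([])))  # empty stream, guarded: no error
    try:
        list(token_types([]))          # empty stream, unguarded
    except TypeError as exc:
        print("TypeError:", exc)
```

If this assumption matches what happens in the library, it would be consistent with the observation that wrapping the malformed tag in any other element avoids the error, since the stream is then no longer empty.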