Status: New
Owner: ----

New issue 220 by st...@strassmann.com: cannot read html5lib in jython
http://code.google.com/p/html5lib/issues/detail?id=220
What steps will reproduce the problem?
Reproducible in Jython 2.5.2 and Jython 2.7b1:

import html5lib

import html5lib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lib/html5lib/__init__.py", line 14, in <module>
    from html5parser import HTMLParser, parse, parseFragment
  File "lib/html5lib/html5parser.py", line 33, in <module>
    import inputstream
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 48-54: illegal Unicode character
What is the expected output? What do you see instead?
Jython cannot read inputstream.py.

Please provide any additional information below.
inputstream.py contains Unicode character literals in the range 0xD800-0xDFFF, which are known as "unpaired surrogates". Jython rejects these as illegal Unicode characters.
This has been closed as won't-fix on the Jython side: http://bugs.jython.org/issue1836

It may be necessary to modify inputstream.py so that it does not use these Unicode character literals when running in Jython.
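One way such a change could look is sketched here. This is not html5lib's actual code: `BASE_CLASS` and `build_invalid_char_re` are hypothetical names, and the character class is abbreviated for illustration. The idea is to keep surrogate escapes out of the source text and add the range at runtime only when the interpreter is not Jython, using the platform check noted in this report. Whether Jython's `unichr` accepts surrogate values is not established here, so the sketch simply leaves the range out on Jython.

```python
import platform
import re

# Assumption: Jython reports "Java" as its platform system.
JYTHON = (platform.system() == 'Java')

try:
    unichr  # Python 2 / Jython
except NameError:
    unichr = chr  # Python 3

# Hypothetical, abbreviated character class with no surrogate literals.
BASE_CLASS = u"\u0001-\u0008\u000B\u000E-\u001F\u007F-\u009F"


def build_invalid_char_re(include_surrogates=not JYTHON):
    """Compile the character class, adding U+D800-U+DFFF only on request."""
    char_class = BASE_CLASS
    if include_surrogates:
        # Built at runtime, so the source file itself never contains
        # a surrogate escape that the compiler would have to decode.
        char_class += unichr(0xD800) + u"-" + unichr(0xDFFF)
    return re.compile(u"[" + char_class + u"]")


invalid_unicode_re = build_invalid_char_re()
```

On Jython this yields a pattern that ignores lone surrogates, which trades some strictness for being importable at all.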
n.b. a test for Jython:

import platform
JYTHON = (platform.system() == 'Java')

--
You received this message because this project is configured to send all issue notifications to this address.