[Guido]
> I knew all that, but I thought I'd read about a hack to encode NUL
> using c0 80, specifically to get around the limitation on encoded
> strings containing a NUL.

Ah, that violates the "shortest encoding" rule, so is invalid UTF-8.  I'm
sure people have done it, though, and that many UTF-8 encoders accept it.
Python's doesn't:

>>> unicode('\xc0\x80', 'utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: illegal encoding
>>>

Believe it or not, accepting non-shortest encodings is considered to be "a
security hole"(!).  That's a sad story of its own <wink> ...
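
To make the "shortest encoding" rule concrete, here is a minimal sketch in present-day Python 3 (not the interpreter used in the session above). The helpers naive_decode_2byte, strict_decode_2byte and python_rejects are hypothetical names invented for this illustration; they are not part of Python's codec machinery. The sketch shows that c0 80 decodes to code point 0 only if the decoder skips the shortest-form check, and that the same trick can smuggle in other ASCII characters such as "/", which is presumably the kind of thing the security worry is about.

```python
def naive_decode_2byte(data):
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check.

    Hypothetical helper for illustration only.
    """
    lead, cont = data[0], data[1]
    # A 2-byte sequence has the bit pattern 110xxxxx 10yyyyyy.
    if (lead & 0xE0) != 0xC0 or (cont & 0xC0) != 0x80:
        raise ValueError("not a 2-byte UTF-8 sequence")
    return ((lead & 0x1F) << 6) | (cont & 0x3F)


def strict_decode_2byte(data):
    """Same as above, but reject non-shortest ("overlong") forms."""
    code_point = naive_decode_2byte(data)
    # Anything below 0x80 fits in one byte, so two bytes is not shortest.
    if code_point < 0x80:
        raise ValueError("overlong encoding of U+%04X" % code_point)
    return code_point


def python_rejects(data):
    """Return True if Python's own UTF-8 decoder refuses the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False
```

With these definitions, naive_decode_2byte(b'\xc0\x80') yields 0 (NUL) and naive_decode_2byte(b'\xc0\xaf') yields 0x2F ("/"), while strict_decode_2byte raises ValueError for both. A check that scans the raw bytes for "/" before decoding would miss the second case if the decoder were lenient.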