Showing content from https://mail.python.org/pipermail/python-dev/2002-April/022904.html below:

[Python-Dev] Re: Regression in unicodestr.encode()?

Tim Peters tim.one@comcast.net
Tue, 09 Apr 2002 21:13:37 -0400
[Guido]
> I knew all that, but I thought I'd read about a hack to encode NUL
> using c0 80, specifically to get around the limitation on encoded
> strings containing a NUL.

Ah, that violates the "shortest encoding" rule, so is invalid UTF-8.  I'm
sure people have done it, though, and that many UTF-8 decoders accept it.
Python's doesn't:

>>> unicode('\xc0\x80', 'utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: illegal encoding
>>>
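
For readers on a current interpreter, here is a rough Python 3 counterpart of
the session above. It is only a sketch: it assumes Python 3's built-in strict
"utf-8" codec, and the names overlong_nul, rejected and shortest_nul are made
up for illustration.

```python
# Python 3 sketch; the session above used Python 2's unicode() builtin.
overlong_nul = b"\xc0\x80"  # two-byte, non-shortest form of U+0000

try:
    overlong_nul.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    # The strict codec refuses the non-shortest form.
    rejected = True

# The shortest (and only valid) UTF-8 encoding of U+0000 is the single byte 0x00.
shortest_nul = "\x00".encode("utf-8")
```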

Believe it or not, accepting non-shortest encodings is considered to be "a
security hole"(!).  That's a sad story of its own <wink> ...
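
One way to see why this gets called a security hole is sketched below. It is
an illustration under stated assumptions, not any real library's decoder:
lenient_decode_2byte is a hypothetical function that skips the shortest-form
check, and the byte string raw is invented input. A check done on the raw
bytes sees no "/", yet the lenient decoder produces one.

```python
def lenient_decode_2byte(data: bytes) -> str:
    """Hypothetical lenient decoder: 1- and 2-byte sequences only,
    with no shortest-form check (this omission is the flaw)."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte.
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
            # Two-byte sequence; a strict decoder would reject lead bytes C0 and C1.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("unsupported or malformed sequence")
    return "".join(out)


raw = b"..\xc0\xafetc"  # 0xC0 0xAF is a non-shortest form of "/" (U+002F)

passes_byte_filter = b"/" not in raw   # a byte-level check finds no slash
decoded = lenient_decode_2byte(raw)    # the lenient decoder yields one anyway

try:
    raw.decode("utf-8")                # Python 3's strict codec, for comparison
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True
```

Rejecting non-shortest forms at decode time, as Python's codec does, means
that a check on the bytes and a check on the decoded text cannot disagree in
this way.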
