Showing content from http://mail.python.org/pipermail/python-dev/2002-April/022899.html below:

[Python-Dev] Re: Regression in unicodestr.encode()?

Guido van Rossum guido@python.org
Tue, 09 Apr 2002 20:50:23 -0400
> [Guido van Rossum]
> > Hm, but isn't there a way to encode a NUL that doesn't produce a NUL?
> > In some variant?

[François]
> There is also a rule about the shortest coding.  It is invalid UTF-8
> to use more bytes than required, and a given UCS character has a
> unique UTF-8 representation.  Moreover, decoders should raise an
> exception on non-minimal UTF-8 codings, and I do not know how Python
> behaves with this.  The Gambit author once told me he found a way to
> implement the test very efficiently.
> 
> One could use multi-byte sequences, that is, a sequence having no NULs,
> that would fool a lazy UTF-8 decoder into producing a NUL.  But for this,
> one has to break the shortest coding rule, and start from invalid UTF-8.

I knew all that, but I thought I'd read about a hack to encode NUL
using c0 80, specifically to get around the limitation on encoded
strings containing a NUL.  But I can't find the reference so I'll shut
up.
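
A minimal Python 3 sketch of the points above, not taken from the
thread. The helper names (encode_nul_free, decode_nul_free,
strict_rejects, lazy_decode_two_bytes) are hypothetical and exist only
for illustration. It assumes the "hack" is the overlong two-byte form
c0 80, which is also what Java's "modified UTF-8" uses for U+0000. It
shows that a strict decoder rejects c0 80, while a lazy decoder that
skips the shortest-form check turns it into a NUL.

```python
def encode_nul_free(text: str) -> bytes:
    """Encode as UTF-8, but write U+0000 as the overlong pair C0 80.

    In valid UTF-8 the byte 0x00 only ever encodes U+0000, so a plain
    byte-level replace is safe. The result is NOT valid UTF-8.
    """
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def decode_nul_free(data: bytes) -> str:
    """Reverse encode_nul_free: map C0 80 back to 0x00, then decode strictly.

    The byte 0xC0 never occurs in valid UTF-8, so the replace cannot
    corrupt any other character.
    """
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")


def strict_rejects(data: bytes) -> bool:
    """Return True if Python's built-in strict UTF-8 decoder refuses data."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def lazy_decode_two_bytes(lead: int, trail: int) -> int:
    """Decode a two-byte sequence WITHOUT the shortest-form check.

    This is the kind of lazy decoder that can be fooled: it just
    concatenates the payload bits of both bytes.
    """
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


if __name__ == "__main__":
    wire = encode_nul_free("a\x00b")
    print(wire)                               # contains no 0x00 byte
    print(strict_rejects(b"\xc0\x80"))        # strict decoder refuses it
    print(lazy_decode_two_bytes(0xC0, 0x80))  # lazy decoder yields code point 0
    print(decode_nul_free(wire) == "a\x00b")  # round trip
```

The trade-off is the one François describes: the byte stream carries
no NUL, but only because it breaks the shortest-form rule, so it must
be converted back before being handed to a conforming UTF-8 decoder.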

--Guido van Rossum (home page: http://www.python.org/~guido/)



