Wed Jan 8 14:56:37 CET 2014 · https://mail.python.org/pipermail/python-dev/2014-January/131024.html

On Tue, Jan 7, 2014 at 10:36 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Daniel Holth writes:
>
>  > Isn't it true that if you have bytes > 127 or surrogate escapes then
>  > encoding to latin1 is no longer as fast as memcpy?
>
> Be careful.  As phrased, the question makes no sense.  You don't "have
> bytes" when you are encoding, you have characters.
>
> If you mean "what happens when my str contains characters in the range
> 128-255?", the answer is encoding a str in 8-bit representation to
> latin1 is effectively memcpy.  If you read in latin1, it's memcpy all
> the way (unless you combine it with a non-latin1 string, in which case
> you're in the cases below).
>
> If you mean "what happens when my str contains characters in the
> range > 255", you have to truncate 16-bit units to 8 bit units; no
> memcpy.
>
> Surrogates require >= 16 bits; no memcpy.

That is neat.
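
For anyone who wants to poke at the three cases above, here is a small sketch. It assumes CPython 3.3 or later (PEP 393 flexible string storage). The helper names `storage_width` and `try_latin1` are made up for this example, and `storage_width` only models the width rule from the highest code point; it does not query the interpreter. Note that what you can observe from Python for characters above 255 is a `UnicodeEncodeError` under the default strict handler, and a lone surrogate only encodes if you ask for `surrogateescape`.

```python
def storage_width(s: str) -> int:
    """Bytes per character PEP 393 would choose for s.

    Hypothetical helper: it models the rule (width follows the highest
    code point) rather than inspecting the real object layout.
    """
    top = max(map(ord, s), default=0)
    if top < 0x100:
        return 1   # ASCII and Latin-1 range: 8-bit units
    if top < 0x10000:
        return 2   # rest of the BMP, including surrogates: 16-bit units
    return 4       # beyond the BMP: 32-bit units


def try_latin1(s: str, errors: str = "strict"):
    """Encode s to latin-1, returning None if the codec refuses."""
    try:
        return s.encode("latin-1", errors)
    except UnicodeEncodeError:
        return None


latin_s = "caf\xe9"     # U+00E9, a character in the range 128-255
wide_s = "caf\u0153"    # U+0153, a character > 255
# A lone surrogate (U+DCFF) produced by decoding an undecodable byte.
surr_s = b"caf\xff".decode("ascii", "surrogateescape")

for label, s in [("latin", latin_s), ("wide", wide_s), ("surrogate", surr_s)]:
    print(label, storage_width(s), try_latin1(s))

# The escaped byte round-trips only with the matching error handler.
print(try_latin1(surr_s, "surrogateescape"))
```

Only the first string is stored in 8-bit units, which is the case where encoding to latin1 can amount to a straight copy of the buffer.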


[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5