Showing content from https://mail.python.org/pipermail/python-dev/2011-August/113040.html below:

[Python-Dev] PEP 393 Summer of Code Project

Victor Stinner victor.stinner at haypocalc.com
Wed Aug 24 20:00:45 CEST 2011
On 24/08/2011 11:22, Glenn Linderman wrote:
>>> c) mostly ASCII (utf8) with clever indexing/caching to be efficient
>>> d) UTF-8 with clever indexing/caching to be efficient
>> I see neither a need nor a means to consider these.
>
> The discussion about "mostly ASCII" strings seems convincing that there
> could be a significant space savings if such were implemented.

Antoine's optimization in the UTF-8 decoder has been removed. That doesn't
change the memory footprint; it is just slower to create the Unicode object.

When you decode a UTF-8 string:

  - "abc" string uses "latin1" (8 bits) units
  - "aé" string uses "latin1" (8 bits) units <= cool!
  - "a€" string uses UCS2 (16 bits) units
  - "a\U0010FFFF" string uses UCS4 (32 bits) units
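The list above can be reproduced with a small sketch. This is not the decoder's actual code; it only assumes that the unit size is picked from the highest code point in the decoded string. The helper name `unit_width` is made up for illustration.

```python
def unit_width(s: str) -> int:
    """Bytes per unit that the narrowest representation of s would need.

    Hypothetical helper: it mirrors the list above, not CPython internals.
    """
    highest = max(map(ord, s), default=0)
    if highest < 0x100:
        return 1  # "latin1" (8 bits) units
    if highest < 0x10000:
        return 2  # UCS2 (16 bits) units
    return 4      # UCS4 (32 bits) units


# The four strings from the list, written as UTF-8 bytes and then decoded.
samples = [
    b"abc",
    "a\u00e9".encode("utf-8"),      # "a" + e-acute
    "a\u20ac".encode("utf-8"),      # "a" + euro sign
    "a\U0010FFFF".encode("utf-8"),  # "a" + highest code point
]
widths = [unit_width(raw.decode("utf-8")) for raw in samples]
print(widths)  # [1, 1, 2, 4]
```

One non-latin1 character is enough to widen every unit of the string, which is why "a€" costs 16 bits per character even for the "a".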

Victor