Showing content from https://mail.python.org/pipermail/python-dev/2010-November/105965.html below:

[Python-Dev] len(chr(i)) = 2?

Antoine Pitrou solipsis at pitrou.net
Wed Nov 24 11:27:30 CET 2010
On Wed, 24 Nov 2010 18:51:49 +0900
"Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> James Y Knight writes:
> 
>  > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly
>  > superior [...] because it is an ASCII superset, and thus more
>  > easily compatible with other software. That also makes it most
>  > commonly used for internet communication.
> 
> Sure, UTF-8 is very nice as a protocol for communicating text.  So
> what?  If your application involves shoveling octets real fast, don't
> convert and shovel those octets.  If your application involves
> significant text processing, well, conversion can almost always be
> done as fast as you can do I/O so it doesn't cost wallclock time, and
> generally doesn't require a huge percentage of CPU time compared to
> the actual text processing.  It's just a specialization of
> serialization, that we do all the time for more complex data
> structures.
> 
> So wire protocols are not a killer argument for or against any
> particular internal representation of text.

Agreed. Decoding and encoding utf-8 is so fast that it should be
dwarfed by any actual processing done on the text.
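
A minimal sketch for checking this on a given machine, assuming Python 3.
The helper names (time_call, codec_round_trip, count_words) and the sample
text are hypothetical, chosen only for illustration, and the word count is
just a stand-in for "actual processing". Absolute numbers will vary with the
interpreter, the hardware, and the kind of text.

```python
import re
import time


def time_call(func, *args, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` calls."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def codec_round_trip(raw):
    """Decode UTF-8 bytes to str, then encode back to UTF-8 bytes."""
    return raw.decode("utf-8").encode("utf-8")


def count_words(raw):
    """Stand-in for text processing: decode, then count word frequencies."""
    counts = {}
    for word in re.findall(r"\w+", raw.decode("utf-8").lower()):
        counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical sample: mixed ASCII and non-ASCII text, a bit over 1 MB encoded.
sample = ("Decoding and encoding UTF-8 is fast. Grüße, 日本語, ελληνικά.\n"
          * 20000).encode("utf-8")

if __name__ == "__main__":
    codec = time_call(codec_round_trip, sample)
    work = time_call(count_words, sample)
    print(f"decode + encode: {codec:.4f} s")
    print(f"decode + count:  {work:.4f} s")
```

Comparing the two printed timings shows how much of the total is spent in the
codec and how much in the processing itself.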

Regards

Antoine.

