A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://mail.python.org/pipermail/python-dev/2011-August/113137.html below:

[Python-Dev] PEP 393 Summer of Code Project

[Python-Dev] PEP 393 Summer of Code ProjectGuido van Rossum guido at python.org
Fri Aug 26 19:26:38 CEST 2011
On Fri, Aug 26, 2011 at 10:13 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 26 August 2011 18:02, Guido van Rossum <guido at python.org> wrote:
>
>> Eek. No, please. Those platforms' native string types have length and
>> slicing operations that are O(1) and work in terms of 16-bit code
>> points. Python should use those. It would be awful if Java and Python
>> code doing the same manipulations on the same string would come to
>> different conclusions because Python tried to paper over surrogates.
>
> *That* is actually the erroneous assumption I had made - that the Java
> and .NET native string type had code point semantics (i.e., took
> surrogates into account). As that isn't the case, my comments aren't
> valid - and I agree that having common semantics (and hence exposing
> surrogates) is too important to lose.

Those platforms probably *also* have libraries of operations to
support writing apps that conform to the Unicode standard. But those
apps will have to be aware of the difference between the "naive"
length of a string and the number of code points of characters in it.

> On the other hand, that pretty much establishes that whatever PEP 393
> achieves in terms of allowing all builds of CPython to offer code
> point semantics, the language definition can't mandate it.

The most severe consequence to me seems that the stdlib (which is
reused by those other platforms) cannot assume CPython's ideal world
-- even if specific apps sometimes can.

-- 
--Guido van Rossum (python.org/~guido)
More information about the Python-Dev mailing list

RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4