Showing content from http://mail.python.org/pipermail/python-dev/2010-November/106110.html below:

[Python-Dev] Python and the Unicode Character Database

Alexander Belopolsky alexander.belopolsky at gmail.com
Mon Nov 29 02:24:24 CET 2010
On Sun, Nov 28, 2010 at 7:55 PM, Ben Finney <ben+python at benfinney.id.au> wrote:
..
>> Of course it is fun that Python can process Bengali numerals, but so
>> would be allowing Roman numerals. There is a reason why after careful
>> consideration, PEP 313 was ultimately rejected.
>
> Rejecting a proposed *new* capability is a different matter from
> disabling an *existing* capability which works in existing Python
> releases.

Was this capability ever documented?  It does not feel like a
deliberate feature.  If it were, '\N{ARABIC DECIMAL SEPARATOR}' would
be accepted in Arabic-Indic notation.  It feels more like a CPython
implementation detail, similar to, say:

>>> int('10') is 10
True
>>> int('10000') is 10000
False

Note that the underlying PyUnicode_EncodeDecimal() function is
described in the unicodeobject.h header file as follows:

/* --- Decimal Encoder ---------------------------------------------------- */

/* Takes a Unicode string holding a decimal value and writes it into
   an output buffer using standard ASCII digit codes.
  ..
  The encoder converts whitespace to ' ', decimal characters to their
   corresponding ASCII digit and all other Latin-1 characters except
   \0 as-is. Characters outside this range (Unicode ordinals 1-256)
   are treated as errors. This includes embedded NULL bytes.
 */
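
The behaviour that header comment describes can be paraphrased in Python.
This is only a rough sketch of the comment's wording, not the actual C
implementation; the function name encode_decimal_sketch is made up for
illustration, and "Latin-1 except \0" is assumed to mean ordinals 1-255.

```python
import unicodedata


def encode_decimal_sketch(text: str) -> str:
    """Hypothetical paraphrase of the header comment, not CPython's code."""
    out = []
    for ch in text:
        if ch.isspace():
            # Any whitespace character becomes a plain ASCII space.
            out.append(' ')
            continue
        value = unicodedata.decimal(ch, None)
        if value is not None:
            # Any Unicode decimal digit becomes its ASCII counterpart.
            out.append(str(value))
        elif 0 < ord(ch) < 256:
            # Other Latin-1 characters (except NUL) pass through unchanged.
            out.append(ch)
        else:
            # Everything else, including an embedded NUL, is an error.
            raise ValueError('invalid decimal character: %r' % ch)
    return ''.join(out)
```

Under these assumptions, Arabic-Indic digits map to ASCII digits, while
'\N{ARABIC DECIMAL SEPARATOR}' falls outside Latin-1 and is rejected.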

So the support for non-ASCII digits is accidental and should be
treated as a bug.
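
For reference, the behaviour under discussion can be reproduced with a
short script. This is a sketch assuming a recent CPython 3; the variable
names are illustrative only.

```python
import unicodedata

# ARABIC-INDIC DIGIT ONE, TWO, THREE
arabic_indic = '\u0661\u0662\u0663'

# int() accepts any characters with the Unicode decimal-digit property.
as_int = int(arabic_indic)
digit_values = [unicodedata.decimal(ch) for ch in arabic_indic]

# The matching decimal separator, however, is not understood by float().
with_separator = '\u0661\N{ARABIC DECIMAL SEPARATOR}\u0665'
try:
    float(with_separator)
    separator_accepted = True
except ValueError:
    separator_accepted = False

print(as_int, digit_values, separator_accepted)
```

The digits convert while the separator does not, which is the asymmetry
pointed out above.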