Showing content from http://mail.python.org/pipermail/python-dev/2000-June/005170.html below:

[Python-Dev] SRE incompatibility

Andrew Kuchling akuchlin@mems-exchange.org
Fri, 30 Jun 2000 10:29:00 -0400


On Fri, Jun 30, 2000 at 04:18:13PM +0200, Fredrik Lundh wrote:
>    re.match('\\x00ffffffffffffff', '\377') != None
>or in other words, long hexadecimal escapes are cast
>down to 8-bit characters in RE.

This is for compatibility with Python string literals:

kronos Python-1.6>./python
>>> '\x00fffffff'
'\377'
>>> u'\x00fffffff'
u'\uFFFF'
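
The session above can be approximated in current Python, which reads exactly two hex digits after \x in string literals and so will not reproduce it directly. The sketch below is a rough emulation, not the interpreter's actual code: `decode_greedy_hex_escape` and its `width_bits` parameter are hypothetical names, and it assumes the "cast down" keeps the low-order bits of the value, which is consistent with the two results shown but not established by them.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def decode_greedy_hex_escape(text, width_bits):
    """Hypothetical helper: expand \\x escapes that swallow every hex digit
    that follows, then keep only the low `width_bits` bits (assumed rule)."""
    mask = (1 << width_bits) - 1
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\x", i):
            # Consume as many hex digits as are available after "\x".
            j = i + 2
            while j < len(text) and text[j] in HEX_DIGITS:
                j += 1
            if j > i + 2:
                value = int(text[i + 2:j], 16)
                out.append(chr(value & mask))  # the "cast down" step
                i = j
                continue
        # Not an escape (or "\x" with no digits): copy the character as-is.
        out.append(text[i])
        i += 1
    return "".join(out)


# 8 bits models the plain string literal, 16 bits the Unicode literal.
narrow = decode_greedy_hex_escape(r"\x00fffffff", 8)
wide = decode_greedy_hex_escape(r"\x00fffffff", 16)
```

Under these assumptions `narrow` is the single character with code 0xFF (octal 377) and `wide` is the single character U+FFFF, matching the two results in the session.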

(Where do these semantics come from, BTW?  C's \x seems to take any
number of hex digits but then reports an error if the character is
greater than 255, too large to fit into a byte.)

Note that the \u escape for Unicode characters uses exactly 4 digits,
no more, no less.  It would certainly be simpler and clearer to only
support a fixed number of digits with \x, since I find the casting
down behaviour is magical and not obvious.  But I don't know if we
want to make that change now.  (Guido now realizes the downside to
numbering it 2.0, as everyone hurries to suggest their favorite
backward-incompatible change.)

That doesn't help with regexes, of course, since a pattern might be
written as a regular string but be intended to match Unicode.  Maybe
the simplest rule is the best: always take 4 digits, even if it winds
up being incompatible with the \x in string literals.
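
The "always take 4 digits" rule could be sketched as follows. This is only an illustration of the proposal, not SRE's code: `decode_fixed_width_hex_escape` is a hypothetical name, and leaving an escape with fewer than four hex digits untouched is an assumption, since the message does not say how that case should be handled.

```python
import re

# "\x" followed by exactly four hex digits; any further digits are literal text.
_FIXED_X = re.compile(r"\\x([0-9a-fA-F]{4})")


def decode_fixed_width_hex_escape(pattern):
    """Hypothetical helper: expand \\xHHHH escapes using exactly four digits."""
    return _FIXED_X.sub(lambda m: chr(int(m.group(1), 16)), pattern)


# "\x00ff" becomes one character; the trailing "ffff" stays as ordinary text.
expanded = decode_fixed_width_hex_escape(r"\x00ffffff")
```

With a fixed width there is nothing to cast down: the escape names one code point regardless of whether the pattern is later matched against 8-bit or Unicode text.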

--amk

