
Mon Jun 17 19:04:41 CEST 2013 · https://mail.python.org/pipermail/python-dev/2013-June/126906.html

On 14.06.13 23:03, PJ Eby wrote:
> On Fri, Jun 14, 2013 at 2:11 PM, Ron Adam <ron3200 at gmail.com> wrote:
>>
>>
>> On 06/14/2013 10:36 AM, Guido van Rossum wrote:
>>>
>>> Not a bug. The same is done for file input -- CRLF is changed to LF before
>>> tokenizing.
>>
>>
>>
>> Should this be the same?
>>
>>
>> python3 -c 'print(bytes("""\r\n""", "utf8"))'
>> b'\r\n'
>>
>>
>>>>> eval('print(bytes("""\r\n""", "utf8"))')
>> b'\n'
>
> No, but:
>
> eval(r'print(bytes("""\r\n""", "utf8"))')
>
> should be.  (And is.)
>
> What I believe you and Walter are missing is that the \r\n in the eval
> strings are converted early if you don't make the enclosing string
> raw.  So what you're eval-ing is not what you think you are eval-ing,
> hence the confusion.

I expected eval()ing a string that contains the characters

    U+0027: APOSTROPHE
    U+0027: APOSTROPHE
    U+0027: APOSTROPHE
    U+000D: CR
    U+000A: LF
    U+0027: APOSTROPHE
    U+0027: APOSTROPHE
    U+0027: APOSTROPHE

to return a string containing the characters:

    U+000D: CR
    U+000A: LF

Making the string raw, of course, turns it into:

    U+0027: APOSTROPHE
    U+0027: APOSTROPHE
    U+0027: APOSTROPHE
    U+005C: REVERSE SOLIDUS
    U+0072: LATIN SMALL LETTER R
    U+005C: REVERSE SOLIDUS
    U+006E: LATIN SMALL LETTER N
    U+0027: APOSTROPHE
    U+0027: APOSTROPHE
    U+0027: APOSTROPHE

and eval()ing that does indeed give "\r\n" as expected.
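
A minimal sketch of the two cases, assuming Python 3 (the variable names are only illustrative). In the first, eval() receives a real CR and LF between the quotes and the newline is normalized during tokenizing; in the second, it receives the backslash escapes and the string literal interprets them itself:

```python
# Source text with a real CR and LF between the triple quotes; the
# non-raw literal is converted before eval() ever sees it.
cooked_source = "'''\r\n'''"

# Source text that still contains the four characters \ r \ n,
# left for the evaluated string literal to interpret.
raw_source = r"'''\r\n'''"

cooked_result = eval(cooked_source)  # CRLF becomes LF while tokenizing
raw_result = eval(raw_source)        # escapes yield CR followed by LF

print(repr(cooked_result))  # '\n'
print(repr(raw_result))     # '\r\n'
```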

Hmm, it seems that codecs.unicode_escape_decode() does what I want:

 >>> codecs.unicode_escape_decode("\r\n\\r\\n\\x0d\\x0a\\u000d\\u000a")
('\r\n\r\n\r\n\r\n', 26)
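
The same call can be sketched with a bytes argument, which makes the returned length easier to check. This is only an illustration and assumes the input is pure ASCII, since the unicode_escape codec treats other bytes as Latin-1:

```python
import codecs

# Literal CR LF, then the same pair written as \r\n, \x0d\x0a and \u000d\u000a.
data = b"\r\n\\r\\n\\x0d\\x0a\\u000d\\u000a"

# Returns the decoded text and the number of input bytes consumed.
text, consumed = codecs.unicode_escape_decode(data)

print(repr(text))  # '\r\n\r\n\r\n\r\n'
print(consumed)    # 26
```

Unlike eval(), this goes nowhere near the tokenizer, so the literal CR at the start survives.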

Servus,
    Walter



[Python-Dev] eval and triple quoted strings