
Showing content from http://mail.python.org/pipermail/python-dev/attachments/20120618/34aeedf1/attachment.html below:

<div class="gmail_quote">On Sun, Jun 17, 2012 at 10:59 PM, Terry Reedy <span dir="ltr"><<a href="mailto:tjreedy@udel.edu" target="_blank">tjreedy@udel.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im">On 6/17/2012 9:07 PM, Guido van Rossum wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Sun, Jun 17, 2012 at 4:55 PM, Nick Coghlan <<a href="mailto:ncoghlan@gmail.com" target="_blank">ncoghlan@gmail.com</a><br>
</blockquote>
<br>
</div><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
    So, perhaps the answer is to leave this as is, and try to make 2to3<br>
    smart enough to detect such escapes and replace them with their<br>
    properly encoded (according to the source code encoding) Unicode<br>
    equivalent?<br>
<br>
<br>
But the whole point of the reintroduction of u"..." is to support code<br>
that isn't run through 2to3.<br>
</blockquote>
<br></div>
People writing 2&amp;3 code sometimes run 2to3 once (or a few times) on their 2.6/2.7 version during development to find things they must pay attention to. So Nick's idea could be helpful even to people who do not want to use 2to3 routinely in either development or deployment.<div class="im">

<br>

<br>
> Frankly, I don't care how it's done, but<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I'd say it's important not to silently have different behavior for the<br>
same notation in the two versions.<br>
</blockquote>
<br></div>
The fundamental problem was giving the 'u' prefix two different meanings in 2.x: 'change the storage type from bytes to unicode', and 'change the contents by partially cooking the literal even when raw processing is requested'*. The only way to silently have the same behavior in both versions is to re-introduce the second meaning, partial cooking. (I would rather make it unnecessary.) But that would freeze the 'u' prefix, or at least 'ur' ('un-raw'), forever. It would be better to introduce a new, separate 'p' prefix to mean partially raw, partially cooked. (But I am opposed to<br>

<br>

*I think this non-orthogonal interaction effect was a design mistake, and that it would have been better to have the re module do all the cooking it needs by also interpreting \u and \U sequences itself. I also think we should add this now for 3.3 if possible, to make partial cooking at the parsing stage unnecessary. Putting the processing in re makes it work for all strings, not just those given as literals.<div class="im">
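A minimal sketch of what this footnote describes, assuming Python 3.3 or later, where the re module interprets \u and \U escapes in patterns itself. The pattern and sample text are illustrative only and do not come from the thread.

```python
import re

# The raw literal keeps the backslash: the pattern text is the 7 characters
# \ u 0 3 b 3 +  and it is the re module, not the parser, that turns the
# escape into GREEK SMALL LETTER GAMMA.
pattern = re.compile(r"\u03b3+")

match = pattern.fullmatch("\u03b3\u03b3")

# Because re does the processing, the same works for a pattern that never
# appeared as a literal, for example one assembled at run time.
built_at_runtime = "\\" + "u03b3"
runtime_match = re.fullmatch(built_at_runtime, "\u03b3")

print(len(pattern.pattern), match is not None, runtime_match is not None)
```

The second half illustrates the point that processing in re applies to all strings, not just to those written as literals.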

<br>

<br>
> If that means we have to add an extra<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
step to the compiler to reject r"\u03b3", so be it.<br>
</blockquote>
<br></div>
I do not get this. Surely you cannot mean to suddenly start rejecting, in 3.3, a large set of perfectly legal and sensible 6 and 10 character sequences when embedded in literals?<br><div class="im"></div></blockquote><div class="im">
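To make the "6 and 10 character sequences" concrete, here is a small sketch of how Python 3 treats these literals. The variable names are illustrative; the Python 2 behaviour is described only in comments, since 'ur' literals do not compile in Python 3.

```python
# In Python 3 a raw literal leaves \u and \U sequences untouched.
raw_bmp = r"\u03b3"         # backslash, 'u', and four hex digits
raw_astral = r"\U0001F600"  # backslash, 'U', and eight hex digits

# A non-raw literal cooks the escape into a single code point.
cooked = "\u03b3"

# In Python 2, ur"\u03b3" was also cooked into one code point even though
# the literal was marked raw; that is the "partial cooking" under discussion.
print(len(raw_bmp), len(raw_astral), len(cooked))
```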

<br>Sorry, I meant rejecting ru"...." (and ur"....") if it contains a \u or \U escape that would be expanded by Python 2.<br><br>
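A rough sketch of the kind of check such a rejection step would need, not the compiler's actual implementation. The helper name has_py2_expanded_escape is hypothetical, and the sketch assumes the Python 2 rule that a \uXXXX or \UXXXXXXXX sequence in a ur"..." literal is expanded only when it is preceded by an odd number of backslashes.

```python
import re

# A run of backslashes followed by a complete \u or \U escape body.
_ESCAPE = re.compile(r"(\\+)(u[0-9a-fA-F]{4}|U[0-9a-fA-F]{8})")


def has_py2_expanded_escape(raw_body):
    """Return True if the body of a raw literal contains a \\u or \\U
    sequence that Python 2 would have expanded in a ur"..." literal.

    Hypothetical helper: an odd-length run of backslashes before the
    'u' or 'U' is taken to mean the escape would be expanded.
    """
    return any(len(m.group(1)) % 2 == 1 for m in _ESCAPE.finditer(raw_body))


print(has_py2_expanded_escape(r"\u03b3"))   # single backslash: would expand
print(has_py2_expanded_escape(r"\\u03b3"))  # doubled backslash: would not
```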

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hm. I still encounter enough environments that don't know how to display<br>
such characters that I would prefer to have a rock solid \u escape<br>
mechanism. I can think of two ways to support "expanded" unicode<br>
characters in raw strings a la Python 2;<br>
</blockquote>
<br>
(a) let the re module interpret the escapes (like it does for \r and \n);<br>
<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
As said above, I favor this. The 2.x partial cooking (with 'ur' prefix) was primarily a substitute for this.<div class="im"><br>
<br>
(b) the user can write r"someblah" "\u03b3" r"moreblah".<br>
<br></div>
This is somewhat orthogonal to (a). Users can do this whenever they want partial processing of backslashes without doubling those they want left as is. A generic example is r'someraw' 'somecooked' r'moreraw' 'morecooked'.<span class="HOEnZb"><font color="#888888"><br>

<br>
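Option (b) relies on the parser joining adjacent string literals into one string. A small sketch, with an illustrative pattern that is not from the thread:

```python
import re

# Three adjacent literals become one string: the raw pieces keep their
# backslashes for re, while the middle piece is cooked by the parser.
pattern = r"\d+" "\u03b3" r"\s*"

# 3 characters for \d+, 1 for the gamma, 3 for \s*
print(len(pattern))

match = re.fullmatch(pattern, "42\u03b3  ")
print(match is not None)
```

The only backslashes that need attention are those in the cooked piece; nothing in the raw pieces has to be doubled.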

-- <br>
Terry Jan Reedy</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
<br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>--Guido van Rossum (<a href="http://python.org/~guido">python.org/~guido</a>)<br>
