> Is there any sort of agreement that Python will use L"..." to denote
> Unicode strings?  I would be happy with it.

I don't know of any agreement, but it makes sense.

> Also, should:
>     print L"foo" -> 'foo'
> and
>     print `L"foo"` -> L'foo'

Yes, I think this should be the way.  Exactly what happens to
non-ASCII characters is up to the implementation.

Do we have agreement on escapes like \xDDDD?  Should \uDDDD be added?

The difference between the two is that according to the ANSI C
standard, which I follow rather strictly for string literals,
'\xABCDEF' is a single character whose value is the lower bits
(however many fit in a char) of 0xABCDEF.  This makes it cumbersome to
write a string consisting of a hex escape followed by a digit or a
letter a-f or A-F; you would have to use another hex escape or split
the literal in two, like this: "\xABCD" "EF".  (This is true for 8-bit
chars as well as for long chars in ANSI C.)

The \u escape takes exactly four hex digits but is not ANSI C.  In
Java, \u has the additional funny property that it is recognized
*everywhere* in the source code, not just in string literals, and I
believe that this complicates the interpretation of things like
"\\uffff" (is the \uffff interpreted before regular string \
processing happens?).  I don't think we ought to copy this behavior,
although JPython users or developers might disagree.

(I don't know anyone who *uses* Unicode strings much, so it's hard to
gauge the importance of these issues.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
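[Editor's note: a sketch of the escape semantics under discussion, as they were later settled in modern Python, which chose fixed-width escapes rather than the open-ended ANSI C rule this email describes; this is one possible resolution, not what the email assumes.]

```python
# In modern Python, \x consumes exactly two hex digits, so a hex
# escape can be followed directly by a letter a-f without ambiguity:
s = "\xABCD"            # one char 0xAB, then the literal text "CD"
assert len(s) == 3

# The ANSI C workaround mentioned above also works: adjacent string
# literals are concatenated at compile time.
t = "\xAB" "CD"
assert s == t

# \u takes exactly four hex digits and, unlike Java, is recognized
# only inside string literals -- so "\\uffff" is unambiguous: a
# backslash followed by the five characters "uffff".
u = "\uffff"
assert len(u) == 1 and ord(u) == 0xFFFF
v = "\\uffff"
assert len(v) == 6 and v[0] == "\\"
```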