[MAL]
> ...
> The conversion goes as follows:
> · for single characters (and this includes all \XXX sequences
>   except \uXXXX), take the ordinal and interpret it as Unicode
>   ordinal
> · for \uXXXX sequences, insert the Unicode character
>   with ordinal 0xXXXX instead

Perfect!

[about "raw" Unicode strings]
> ...
> Not sure whether we really need to make this even more complicated...
> The \uXXXX strings look ugly, adding a few \\\\ for e.g. REs or
> filenames won't hurt much in the context of those \uXXXX monsters :-)

Alas, this won't stand over the long term.  Eventually people will write
Python using nothing but Unicode strings -- "regular strings" will
eventually become a backward compatibility headache <0.7 wink>.  IOW,
Unicode regexps and Unicode docstrings and Unicode formatting ops ...
nothing will escape.  Nor should it.

I don't think it all needs to be done at once, though -- existing languages
usually take years to graft in gimmicks to cover all the fine points.  So,
happy to let raw Unicode strings pass for now, as a relatively minor point,
but without agreeing it can be ignored forever.

> ...
> BTW, if you want to type in UTF-8 strings and have them converted
> to Unicode, you can use the standard:
>
> u = unicode('...string with UTF-8 encoded characters...','utf-8')

That's what I figured, and thanks for the confirmation.
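
For readers following along, here is a rough sketch of the two conversion rules quoted above, plus the UTF-8 route, written in present-day Python rather than the 1.x-era API under discussion. The names convert_literal and from_utf8 are made up for illustration, and the sketch deliberately ignores backslash-counting subtleties (a doubled backslash before "u" is not treated specially), so take it as an approximation of the proposal, not the implementation.

```python
import re

# Matches a backslash, 'u', and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def convert_literal(raw: bytes) -> str:
    """Hypothetical helper approximating the quoted conversion rules."""
    # Rule 1: each single character's ordinal is taken as a Unicode
    # ordinal. Decoding as Latin-1 does this: byte n -> code point n.
    text = raw.decode("latin-1")
    # Rule 2: each \uXXXX sequence becomes the character with
    # ordinal 0xXXXX. (Simplification: no check for escaped backslashes.)
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def from_utf8(raw: bytes) -> str:
    """Modern counterpart of unicode('...', 'utf-8') from the quoted text."""
    return raw.decode("utf-8")


if __name__ == "__main__":
    # Byte 0xE9 maps to U+00E9; the literal \u20ac maps to U+20AC.
    print(convert_literal(b"caf\xe9 \\u20ac"))
    # The same "e with acute" spelled as two UTF-8 bytes.
    print(from_utf8(b"caf\xc3\xa9"))
```

The point of the contrast: under rule 1 the byte 0xE9 alone is already "é", whereas the same character typed as UTF-8 is two bytes and needs the explicit decode step.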