Tim Peters wrote:
>
> [MAL]
> > ...
> > The conversion goes as follows:
> > · for single characters (and this includes all \XXX sequences
> >   except \uXXXX), take the ordinal and interpret it as Unicode
> >   ordinal
> > · for \uXXXX sequences, insert the Unicode character
> >   with ordinal 0xXXXX instead
>
> Perfect!  Thanks :-)
>
> [about "raw" Unicode strings]
> > ...
> > Not sure whether we really need to make this even more complicated...
> > The \uXXXX strings look ugly, adding a few \\\\ for e.g. REs or
> > filenames won't hurt much in the context of those \uXXXX monsters :-)
>
> Alas, this won't stand over the long term.  Eventually people will write
> Python using nothing but Unicode strings -- "regular strings" will
> eventually become a backward compatibility headache <0.7 wink>.  IOW,
> Unicode regexps and Unicode docstrings and Unicode formatting ops ...
> nothing will escape.  Nor should it.
>
> I don't think it all needs to be done at once, though -- existing languages
> usually take years to graft in gimmicks to cover all the fine points.  So,
> happy to let raw Unicode strings pass for now, as a relatively minor point,
> but without agreeing it can be ignored forever.

Agreed... note that you could also write your own codec for just this
reason and then use:

    u = unicode('....\u1234...\...\...','raw-unicode-escaped')

Put that into a function called 'ur' and you have:

    u = ur('...\u4545...\...\...')

which is not that far away from ur'...' w/r to cosmetics.

> > ...
> > BTW, if you want to type in UTF-8 strings and have them converted
> > to Unicode, you can use the standard:
> >
> >     u = unicode('...string with UTF-8 encoded characters...','utf-8')
>
> That's what I figured, and thanks for the confirmation.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 49 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
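
For illustration only, here is a rough sketch of how the 'ur' helper and the two conversions discussed above might look in present-day Python 3, where str is already Unicode. It is not the codec proposed in the message: the 'raw-unicode-escaped' name there was a suggestion, and this sketch assumes the built-in 'raw_unicode_escape' and 'latin-1' codecs as stand-ins. The function name ur and the sample strings are hypothetical.

```python
def ur(text):
    # Hypothetical helper, named after the suggestion in the message.
    # Encoding with 'raw_unicode_escape' leaves backslashes untouched;
    # decoding then expands only \uXXXX (and \UXXXXXXXX) escapes.
    return text.encode("raw_unicode_escape").decode("raw_unicode_escape")

# Regex-style and path-style backslashes survive; only \uXXXX is expanded.
pattern = ur(r"\d+\u20ac")
path = ur(r"C:\temp\u00e9")

# Single 8-bit characters map to the Unicode code point with the same
# ordinal, which is what the Latin-1 codec does.
latin = b"caf\xe9".decode("latin-1")

# UTF-8 input is converted by naming the codec explicitly.
utf8 = b"caf\xc3\xa9".decode("utf-8")
```

The round trip through the codec is what keeps backslashes such as \d or \t literal, so the caller does not have to double them the way an ordinary string literal would require.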