> Guido van Rossum <guido@python.org> writes:
>
> > >>> u = u'\u1f40'
> > >>> s = u.encode('utf8')
> > >>> s
> > 'a=\x80'
> > >>>
> >
> > The latter output is not helpful, because the encoding of s is not the
> > locale's encoding.

[Martin]
> [Somehow, the accents got lost in your message]
>
> It isn't helpful, but it isn't strictly wrong, either. In this
> specific case, people are used to see utf-8 being interpreted as
> Latin-1 - that form of "mojibake" is very common, so they will know
> what happened.
>
> I question whether the hex representation is more helpful: it depends
> on how you need to interpret the result you get.

Well, if you *want* to see the hex codes for all non-ASCII characters,
repr() used to be your friend. No more.

If you *want* to see the printable characters, you could always use
print.

I'd be okay with this change if the default locale wasn't changed by
readline. Did you see my patch for that? Then people who want to see
their encoding from repr() can learn to put

    import locale
    locale.setlocale(locale.LC_CTYPE, "")

in their $PYTHONSTARTUP file (or in their app's main()).

--Guido van Rossum (home page: http://www.python.org/~guido/)
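
For readers following along today, here is a rough sketch of the two views discussed above (hex escapes versus UTF-8 misread as Latin-1) together with the opt-in locale call. It is written for Python 3, where repr() of a bytes object still shows hex escapes, so it does not reproduce the interpreter behaviour debated in this thread. The sample character and the helper name adopt_user_ctype are assumptions made for illustration; they are not taken from the original messages.

```python
import locale

# Hypothetical sample character (e with acute accent); the character used
# in the original message was lost in transit, so this is an assumption.
text = "\u00e9"
encoded = text.encode("utf-8")

# repr() of the bytes shows hex escapes for every non-ASCII byte.
hex_view = repr(encoded)

# Decoding the same bytes as Latin-1 gives the familiar "mojibake" form.
mojibake = encoded.decode("latin-1")


def adopt_user_ctype():
    """Opt in to the user's LC_CTYPE setting, as suggested for $PYTHONSTARTUP.

    Returns the locale name that was set, or None if the environment
    names a locale that this system does not support.
    """
    try:
        return locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        return None


if __name__ == "__main__":
    print(hex_view)
    print(mojibake)
    print(adopt_user_ctype())
```

Wrapping the setlocale() call in a function keeps it an explicit choice of the startup file or the application's main(), rather than a side effect of importing some other module.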