Guido van Rossum <guido@python.org> writes:

> >>> u = u'\u1f40'
> >>> s = u.encode('utf8')
> >>> s
> 'a=\x80'
> >>>
>
> The latter output is not helpful, because the encoding of s is not the
> locale's encoding.

[Somehow, the accents got lost in your message]

It isn't helpful, but it isn't strictly wrong, either. In this specific
case, people are used to seeing UTF-8 being interpreted as Latin-1 - that
form of "mojibake" is very common, so they will know what happened.

I question whether the hex representation is more helpful: it depends
on how you need to interpret the result you get.

Regards,
Martin
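
For readers who want to reproduce the two representations being compared, here is a minimal sketch. It assumes Python 3 (the session quoted above is Python 2), and the variable names `mojibake` and `hex_form` are made up for illustration. It encodes the same code point as UTF-8, then shows the bytes both misread as Latin-1 and as hex escapes.

```python
# Sketch only: Python 3 syntax, not the Python 2 session quoted above.
u = "\u1f40"                      # same code point as in the quoted example
s = u.encode("utf-8")             # three bytes: 0xE1 0xBD 0x80

# Misreading the UTF-8 bytes as Latin-1 produces the familiar mojibake.
mojibake = s.decode("latin-1")

# The alternative under discussion: show the raw bytes as hex escapes.
hex_form = "".join("\\x%02x" % b for b in s)

print(repr(mojibake))             # 'á½\x80'
print(hex_form)                   # \xe1\xbd\x80

# Neither view loses information: both can be turned back into the original.
assert mojibake.encode("latin-1").decode("utf-8") == u
```

Which form is more useful depends on whether the reader needs to recognize the text or inspect the bytes, which is the point made above.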