Zooko O'Whielacronx wrote:
> Following-up to my own post to correct a major error:
> Is it true that
> srcbytes.encode(srcencoding, 'python-escape').decode('utf-8',
> 'python-escape') will always produce srcbytes ? That is my Requirement

If you start with bytes, decode with utf-8b to unicode (possibly 'invalid'), and encode the result back to bytes with utf-8b, you should get the original bytes, regardless of what they were. That is the point of PEP 383 -- to reliably roundtrip file 'names' that start as bytes and must end as the same bytes, but which may not otherwise have a unicode decoding.

If you start with invalid unicode text, encode it to bytes with utf-8b, and decode back to unicode, you might instead get a different, valid unicode text. An example was given in the discussion. I believe this would be hard to avoid. In any case, it does not matter for the use case of starting with bytes that one wants to work with as text, temporarily but reliably.

Terry Jan Reedy
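
A minimal sketch of both directions described above. It assumes the error handler as it later shipped in Python 3.1 under the name 'surrogateescape'; 'utf-8b' and 'python-escape' were working names used in this thread. The helper names and sample values are hypothetical, chosen only for illustration.

```python
# Assumption: 'surrogateescape' is the released name of the PEP 383 handler
# (called 'utf-8b' / 'python-escape' in the discussion).

def bytes_roundtrip(raw: bytes) -> bytes:
    """bytes -> str -> bytes; expected to return the original bytes."""
    text = raw.decode('utf-8', 'surrogateescape')
    return text.encode('utf-8', 'surrogateescape')


def text_roundtrip(text: str) -> str:
    """str -> bytes -> str; may NOT return the original text."""
    raw = text.encode('utf-8', 'surrogateescape')
    return raw.decode('utf-8', 'surrogateescape')


# Direction 1: any byte string survives, valid UTF-8 or not.
samples = [b'plain.txt', b'caf\xc3\xa9', b'bad\xff\xfename', bytes(range(256))]
for raw in samples:
    assert bytes_roundtrip(raw) == raw

# Direction 2: two lone surrogates standing for the bytes C3 A9.
# Those bytes happen to be valid UTF-8 for U+00E9, so decoding
# yields a different, valid string.
invalid_text = '\udcc3\udca9'
assert text_roundtrip(invalid_text) == '\u00e9'
assert text_roundtrip(invalid_text) != invalid_text
```

The asymmetry is harmless for the file name use case: such surrogate sequences only arise from decoding bytes that were not valid UTF-8, and a decoder never produces the pair above from real input.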