On Feb 19, 2006, at 10:55 AM, Martin v. Löwis wrote:

> Stephen J. Turnbull wrote:
>> BTW, what use cases do you have in mind for Unicode -> Unicode
>> decoding?
>
> I think "rot13" falls into that category: it is a transformation
> on text, not on bytes.

The current implementation is a transformation on bytes, not text.
Conceptually though, it's a text->text transform.

> For other "odd" cases: "base64" goes Unicode->bytes in the *decode*
> direction, not in the encode direction. Some may argue that base64
> is bytes, not text, but in many applications, you can combine base64
> (or uuencode) with arbitrary other text in a single stream. Of course,
> it could be required that you go u.encode("ascii").decode("base64").

I would say that base64 is bytes->bytes. Just because those bytes
happen to be in a subset of ASCII, it's still a serialization meant for
wire transmission. Sometimes it ends up in unicode (e.g. in XML), but
that's the exception, not the rule.

-bob
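To make the two positions concrete, here is a minimal sketch of the distinction, assuming a modern Python 3 interpreter (which postdates this thread, so it is not the implementation being discussed). The helper names `rot13_text`, `b64_roundtrip`, and `b64_from_text` are hypothetical and exist only for illustration. It treats rot13 as text->text, base64 as bytes->bytes, and shows the explicit ASCII step for base64 that arrives embedded in text.

```python
import base64
import codecs


def rot13_text(text: str) -> str:
    """Text -> text: rot13 through the stdlib 'rot_13' codec."""
    return codecs.encode(text, "rot_13")


def b64_roundtrip(payload: bytes) -> bytes:
    """Bytes -> bytes in both directions, as for wire transmission."""
    encoded = base64.b64encode(payload)  # bytes restricted to an ASCII subset
    return base64.b64decode(encoded)


def b64_from_text(text: str) -> bytes:
    """Base64 found inside text (e.g. in XML): go to ASCII bytes first."""
    return base64.b64decode(text.encode("ascii"))


if __name__ == "__main__":
    print(rot13_text("Hello"))
    print(base64.b64encode(b"hello"))
    print(b64_from_text("aGVsbG8="))
```

The last helper mirrors the u.encode("ascii").decode("base64") spelling quoted above: the text is reduced to ASCII bytes explicitly before the bytes->bytes decode runs.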