R. David Murray writes:

 > You transform *into* the encoding, and untransform *out* of the
 > encoding.  Do you have an example where that would be ambiguous?

In the bytes-to-bytes case, any pair of character encodings (eg, UTF-8 and ISO-8859-15) would do.  Or how about in text, ReST to HTML?

BASE64 itself is ambiguous.  By RFC specification, BASE64 is a *textual* representation of arbitrary binary data.  (Cf. URIs.)  The natural interpretation of .encode('base64') in that context would be as a bytes-to-text encoder.  However, this has several problems.  In practice, we invariably use an ASCII octet stream to carry BASE64-encoded data.  So web developers would almost certainly expect a bytes-to-bytes encoder.  Such a bytes-to-bytes encoder can't be duck-typed.  Double-encoding bugs wouldn't be detected until the stream arrives at the user.  And the RFC-based signature of .encode('base64') as bytes-to-text is precisely opposite to that of .encode('utf-8') (text-to-bytes).

It is certainly true that there are many unambiguous cases.  In the case of a true text processing facility (eg, Emacs buffers or Python 3 str) where there is an unambiguous text type with a constant and opaque internal representation, it makes a lot of sense to treat the text type as special/central, and use the terminology "encode [from text]" and "decode [to text]".  It's easy to remember, which one is special is obvious, and the difference in input and output types means that mistaken use of the API will be detected by duck-typing.

However, in the case of bytes-bytes or text-text transformations, it's not the presence of unambiguous cases that should drive API design IMO.  It's the presence of the ambiguous cases that we should cater to.  I don't see easy solutions to this issue.

Steve
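
For illustration, the signature mismatch and the double-encoding hazard described above can be sketched in Python 3 using only the standard library's `base64` module.  This is a minimal sketch, not a proposed API: the helper `b64_to_text` is a hypothetical name standing in for the RFC-style bytes-to-text reading, and `payload` is made-up sample data.

```python
import base64

payload = b"\x00\xffhello"  # arbitrary sample binary data (assumption)

# Text-to-bytes: str.encode yields bytes, bytes.decode yields str.
text = "na\u00efve"
utf8_bytes = text.encode("utf-8")
assert isinstance(utf8_bytes, bytes)
assert utf8_bytes.decode("utf-8") == text

# Bytes-to-bytes: base64.b64encode returns ASCII octets, not str.
b64_bytes = base64.b64encode(payload)
assert isinstance(b64_bytes, bytes)

# Because input and output share a type, encoding twice raises no error.
double = base64.b64encode(b64_bytes)
assert isinstance(double, bytes)
assert base64.b64decode(double) == b64_bytes  # one decode is not enough


def b64_to_text(data: bytes) -> str:
    """Hypothetical bytes-to-text encoder matching the RFC reading."""
    return base64.b64encode(data).decode("ascii")


as_text = b64_to_text(payload)

# With distinct input and output types, a second encode fails at once.
try:
    base64.b64encode(as_text)
except TypeError:
    double_encode_caught = True
else:
    double_encode_caught = False
```

Under these assumptions, the bytes-to-bytes form accepts its own output without complaint, while the bytes-to-text form turns the same mistake into an immediate `TypeError`.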