Stephen J. Turnbull wrote:

> it does refer to *encoded* characters as the output of
> the encoding process:
>
> > The encoding process represents 24-bit groups of input bits
> > as output strings of 4 encoded characters.

The "encoding" being referred to there is the encoding from input bytes to output characters, not an encoding of the output characters as bytes. Nowhere in RFC 4648 does it refer to the output as being made up of "bytes" or "octets". It's always described in terms of "characters".

> As I understand it, the intention of the standard
> in using "character" to denote the code unit is similar to that of RFC
> 3986: BASE encodings are intended to be printable and recognizable to
> humans.

Hmmm... so why then does it say, in section 4:

    The Base 64 encoding is designed to represent arbitrary sequences
    of octets in a form that ... need not be human readable.

> If you're using a non-ASCII-superset encoding such as EBCDIC
> for text I/O, then you should translate from ASCII to that encoding
> for display,

What about the channel you're sending the encoded data over? Suppose I'm on Windows and I'm embedding the base64-encoded data in a text message that I'm sending through a mail client that accepts text in UTF-16. I hope you would agree that, in that situation, encoding the base64 output in ASCII and giving those bytes directly to the mail client would be very much the wrong thing to do?

-- 
Greg
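The distinction can be sketched in Python (the payload bytes and the UTF-16 mail client are hypothetical, chosen just for illustration): base64 output is naturally a *character* string, and how those characters become bytes depends on the channel they are sent over.

```python
import base64

# Arbitrary octets -- the input to base64 is binary, not text.
payload = bytes([0x00, 0xFF, 0x10, 0x80])

# b64encode returns the ASCII bytes of the encoded characters; decoding
# with 'ascii' recovers the character string the RFC talks about.
encoded_chars = base64.b64encode(payload).decode("ascii")
print(encoded_chars)  # a str of characters from the base64 alphabet

# For a channel that expects UTF-16 text (the hypothetical mail client),
# the characters must be re-encoded for that channel...
for_channel = encoded_chars.encode("utf-16-le")

# ...because the raw ASCII bytes are not the same byte sequence:
assert for_channel != encoded_chars.encode("ascii")

# Either way, the *characters* decode back to the original octets.
assert base64.b64decode(encoded_chars) == payload
```

The point of the sketch: the same character string has different byte representations in ASCII and UTF-16, so passing ASCII-encoded bytes to a UTF-16 channel corrupts the message even though the base64 characters themselves are unchanged.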