On Thu, Apr 25, 2013 at 7:43 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Thu, 25 Apr 2013 04:19:36 +0200
> Lennart Regebro <regebro at gmail.com> wrote:
>> On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> > RFC 4648 repeatedly refers to *characters*, without specifying an
>> > encoding for them.
> [...]
>>
>> Base64 is an encoding that transforms between 8-bit streams.
>
> No, it isn't. What Stephen wrote above.

Yes it is. Base64 takes 8-bit bytes and transforms them into another
8-bit stream that can be safely transmitted over various channels that
would mangle an unencoded 8-bit stream, such as email etc.

http://en.wikipedia.org/wiki/Base64

>> Either you get a "LookupError: unknown
>> encoding: base64", which is what you get now, or you get an
>> UnicodeEncodingError if the text is not ASCII. We don't want the
>> latter, because it means that code that looks fine for the developer
>> breaks in real life because the developer was American
>
> That's bogus.

No, that's real life.

> By the same argument, we should suppress any
> encoding which isn't able to represent all possible unicode strings.

No, if you explicitly use such an encoding it is because you need to
because you are transferring data to a system that needs the encoding
in question. Unicode errors are unavoidable at that point, not an
unexpected surprise because a conversion happened implicitly that you
didn't know about.

//Lennart
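
To make the two failure modes discussed above concrete, here is a minimal sketch, assuming a recent Python 3 where the standard `base64` module works on bytes and `str.encode("base64")` is rejected with a `LookupError`. The sample strings and the helper name `try_text_encode` are made up for illustration; they are not from the thread.

```python
import base64

# Explicit route: the caller picks the text encoding, then Base64
# maps bytes to bytes (the output happens to be ASCII-only).
raw = "na\u00efve caf\u00e9".encode("utf-8")
encoded = base64.b64encode(raw)
decoded = base64.b64decode(encoded)


def try_text_encode(text):
    """Hypothetical helper: return the exception raised, if any,
    when str.encode() is asked for the base64 codec."""
    try:
        return text.encode("base64")
    except LookupError as exc:
        return exc


def try_implicit_ascii(text):
    """Hypothetical helper: mimic an implicit ASCII step before
    Base64, the behaviour the message argues against."""
    try:
        return base64.b64encode(text.encode("ascii"))
    except UnicodeEncodeError as exc:
        return exc
```

With an implicit ASCII step, `try_implicit_ascii("hello")` succeeds while `try_implicit_ascii("na\u00efve")` returns a `UnicodeEncodeError`, which is the "looks fine for the developer, breaks in real life" case. Rejecting the call up front, as `try_text_encode` shows, fails the same way for every input.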