On Thu, 25 Apr 2013 04:19:36 +0200
Lennart Regebro <regebro at gmail.com> wrote:
> On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > RFC 4648 repeatedly refers to *characters*, without specifying an
> > encoding for them.
> [...]
>
> Base64 is an encoding that transforms between 8-bit streams.

No, it isn't. What Stephen wrote above.

> Either you get a "LookupError: unknown
> encoding: base64", which is what you get now, or you get an
> UnicodeEncodingError if the text is not ASCII. We don't want the
> latter, because it means that code that looks fine for the developer
> breaks in real life because the developer was American

That's bogus. By the same argument, we should suppress any encoding
which isn't able to represent all possible unicode strings. That's
almost all encodings provided by Python (including utf-8, if you
consider lone surrogates).

I'm sorry for Americans, but they *still* must know about character
encodings, and be ready to handle UnicodeErrors, when using Python 3
for encoding/decoding bytestrings. There's no way around it.

Regards

Antoine.
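
P.S. A minimal sketch of the point about encodings that cannot represent
every unicode string, assuming Python 3. The sample strings and the helper
name encode_failures are hypothetical, chosen only for illustration; they
do not come from the thread.

```python
# Illustrative sketch only: the sample strings and the helper name are
# assumptions made for this example.
samples = {
    "ascii": "na\u00efve",   # contains a non-ASCII letter
    "latin-1": "\u20ac",     # the euro sign is outside Latin-1
    "utf-8": "\ud800",       # a lone surrogate
}


def encode_failures(samples):
    """Return {codec: reason} for each sample its codec cannot encode."""
    failures = {}
    for codec, text in samples.items():
        try:
            text.encode(codec)
        except UnicodeEncodeError as exc:
            failures[codec] = exc.reason
    return failures


failures = encode_failures(samples)
for codec, reason in failures.items():
    print(f"{codec}: {reason}")
```

Each of the three codecs rejects its sample with a UnicodeEncodeError,
so the caller has to be prepared for that error whichever codec is in use.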