Nick Coghlan writes:
> On Tue, Jun 22, 2010 at 4:49 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > > Which works if and only if your outputs are truly unicode-able.
> >
> > With PEP 383, they always are, as long as you allow Unicode to be
> > decoded to the same garbage your bytes-based program would have
> > produced anyway.
>
> Could it be that part of the problem here is that we need to better
> advertise "errors='surrogateescape'" as a mechanism for decoding
> incorrectly encoded data according to a nominal codec without throwing
> UnicodeDecode and UnicodeEncode errors all over the place?

Yes, I think that would make the "use str internally to urllib"
strategy a lot more palatable. But it still needs to be combined with
a program architecture of decode-process-encode, which might require
substantial refactoring for some existing modules.
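
To make the decode-process-encode idea concrete, here is a minimal sketch of how it might look with errors='surrogateescape'. The helper name rewrite_path and the sample bytes are hypothetical illustrations, not anything taken from urllib; the only things assumed to exist are the standard bytes.decode/str.encode methods and the PEP 383 error handler.

```python
def rewrite_path(raw: bytes, old: str, new: str) -> bytes:
    """Hypothetical helper: decode at the boundary, process as str, encode on the way out."""
    # Decode: bytes that are invalid in the nominal codec become lone
    # surrogates (U+DC80..U+DCFF) instead of raising UnicodeDecodeError.
    text = raw.decode("utf-8", errors="surrogateescape")
    # Process: ordinary str operations; the escaped bytes ride along untouched.
    text = text.replace(old, new)
    # Encode: the same handler turns the surrogates back into the original bytes.
    return text.encode("utf-8", errors="surrogateescape")


# Sample input (an assumption for illustration): Latin-1 encoded "café",
# which is not valid UTF-8 because 0xE9 is not followed by continuation bytes.
raw = b"caf\xe9/index.html"
result = rewrite_path(raw, "/index.html", "/home.html")

# The catch: the intermediate str is not cleanly encodable without the
# handler, so every output boundary has to opt in as well.
try:
    raw.decode("utf-8", errors="surrogateescape").encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True
```

Under these assumptions the undecodable byte survives the round trip unchanged, which is the "same garbage your bytes-based program would have produced anyway" behaviour. The strict encode at the end fails, which is why the handler has to be applied consistently at both ends of the pipeline and why existing modules may need refactoring.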