On Fri, Jun 25, 2010 at 3:07 AM, P.J. Eby <pje at telecommunity.com> wrote:
> (Btw, in some earlier emails, Stephen, you implied that this could be fixed
> with codecs -- but it can't, because the problem isn't with the bytes
> containing invalid Unicode, it's with the Unicode containing invalid bytes
> -- i.e., characters that can't be encoded to the ultimate codec target.)

That's what the surrogateescape error handler is for though - it will
happily accept mojibake on input (putting invalid bytes into the PUA), and
happily generate mojibake on output (recreating the invalid bytes from the
PUA) as well.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
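
For illustration, a minimal sketch of the round trip described above, assuming Python 3.1 or later and UTF-8 as the codec. The byte string and variable names are made up for the example. One detail worth noting: in PEP 383 as implemented in CPython, each undecodable byte 0xNN is mapped to the lone surrogate code point U+DCNN (range U+DC80 to U+DCFF), not to the Private Use Area.

```python
# Hypothetical input: 0xff can never occur in well-formed UTF-8.
raw = b"caf\xff.txt"

# Input side: the handler accepts the invalid byte and smuggles it
# into the str as the lone surrogate U+DCFF.
text = raw.decode("utf-8", errors="surrogateescape")

# Output side: encoding with the same handler recreates the original
# invalid byte from the surrogate.
roundtrip = text.encode("utf-8", errors="surrogateescape")

# Without the handler, the str cannot be encoded at all, which is the
# "Unicode containing invalid bytes" case from the quoted message.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(repr(text), roundtrip == raw, strict_ok)
```

The round trip is only lossless when the same codec and the same error handler are used in both directions; encoding to a different target codec will still reproduce the raw bytes rather than transcode them.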