Michael Urman wrote:
> On Wed, May 6, 2009 at 15:42, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Despite there being also an error handler called "surrogates".
>
> Not that I have to be, but I'm not sold on the previous UTF-8 codec
> behavior becoming an error handler of the name "surrogates" for two
> reasons (I do respect the obvious PBP argument for the implementation,
> and have no better name - "lenient"?).

PBP?

> First, unless there's a way to stack error handlers, there's no way to
> access the old behavior combined with the "replace" handler.

Well, there is a way to stack error handlers, although it's not pretty:

    import codecs

    _surrogates = codecs.lookup_error("surrogates")
    _replace = codecs.lookup_error("replace")

    def surrogates_then_replace(exc):
        # Try "surrogates" first; fall back to "replace" if it gives up.
        try:
            return _surrogates(exc)
        except UnicodeError:
            return _replace(exc)

    codecs.register_error("surrogates_then_replace",
                          surrogates_then_replace)

> The stacking argument also applies to the new utf8b behavior on encode
> (only, as it handles all errors on decode). This may be a YAGNI

Indeed - in particular since, in the primary application of this error
handler (i.e. file IO operations), there is no way of specifying an
additional error handler anyway.

Regards,
Martin
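As a hedged illustration of the stacking approach described in the message, here is a minimal runnable sketch. It assumes that the handler referred to as "surrogates" corresponds to the one registered as "surrogatepass" in released Python 3 versions; that mapping, the combined handler name "surrogatepass_then_replace", and the sample bytes are assumptions made for this example, not something stated in the thread.

```python
import codecs

# Assumption: "surrogatepass" stands in for the handler the message calls
# "surrogates"; the latter name is not registered in released Pythons.
_surrogatepass = codecs.lookup_error("surrogatepass")
_replace = codecs.lookup_error("replace")


def surrogatepass_then_replace(exc):
    """Try the lenient surrogate handler first, then fall back to "replace"."""
    try:
        return _surrogatepass(exc)
    except UnicodeError:
        # The first handler re-raises errors it cannot deal with,
        # so the second handler gets a chance at the same exception.
        return _replace(exc)


# Hypothetical name for the combined handler.
codecs.register_error("surrogatepass_then_replace", surrogatepass_then_replace)

# A lone surrogate encoded as UTF-8 (ED A0 80) followed by an invalid byte (FF).
data = b"a\xed\xa0\x80\xffb"
text = data.decode("utf-8", "surrogatepass_then_replace")
print(ascii(text))
```

With this input, the surrogate bytes should come through as U+D800 while the stray 0xFF byte is turned into U+FFFD by the fallback, which is the combination that neither handler provides on its own.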