On 2/14/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Fred L. Drake, Jr. wrote:
>
> > The proper response in this case is often to re-start decoding
> > with the correct encoding, since some of the data extracted so far may have
> > been decoded incorrectly.
>
> If the protocol has been sensibly designed, that shouldn't
> happen, since everything up to the coding marker should
> be ascii (or some other protocol-defined initial coding).
>
> For protocols that are not sensibly designed (or if you're
> just trying to guess) what you suggest may be needed. But
> it would be good to have a nicer way of going about it
> for when the protocol is sensible.

I think that the implementation of encoding-guessing or
auto-encoding-upgrade techniques should be left out of the standard
library design for now.

I know that XML does something like this, but fortunately we employ
dedicated C code to parse XML so that particular case should be taken
care of without complicating the rest of the standard I/O library.

As far as searching bytes objects, that shouldn't be a problem as long
as the search 'string' is also specified as a bytes object.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
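
To illustrate the "sensible protocol" case and the bytes-search point, here is a minimal sketch in present-day Python 3. It is not anything proposed in this thread: the function name `decode_with_marker`, the `coding: ` marker, and the one-line header layout are all hypothetical, chosen only for the example. It assumes the first line is ASCII and may name the encoding of the rest.

```python
def decode_with_marker(data: bytes, marker: bytes = b"coding: ",
                       default: str = "ascii") -> str:
    """Decode data whose ASCII first line may name the body's encoding.

    Hypothetical protocol, for illustration only.
    """
    # Split off the first line; everything up to here is assumed ASCII.
    header, _sep, body = data.partition(b"\n")

    # Search the bytes object with a bytes pattern (a str pattern
    # would raise TypeError).
    pos = header.find(marker)
    if pos == -1:
        # No marker: fall back to the protocol-defined initial coding.
        return data.decode(default)

    # The encoding name itself is ASCII by assumption.
    encoding = header[pos + len(marker):].strip().decode("ascii")

    # Nothing decoded so far needs redoing: the header was ASCII, and
    # only the body uses the declared encoding.
    return header.decode("ascii") + "\n" + body.decode(encoding)
```

Because the header is pure ASCII, no re-start is needed once the marker is found. A protocol that allows non-ASCII data before the marker would still need the re-decoding Fred describes.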