On Nov 9, 2007 3:59 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Martin v. Löwis wrote:
> >> It makes working with XML data a lot easier: you simply don't have to
> >> bother with the encoding of the XML data anymore and can just let the
> >> codec figure out the details. The XML parser can then work directly
> >> on the Unicode data.
> >
> > Having the functionality indeed makes things easier. However, I don't
> > find
> >
> >   s.decode(xml.detect_encoding(s))
> >
> > particularly more difficult than
> >
> >   s.decode("xml-auto-detection")
>
> Not really, but the codec has more control over what happens to
> the stream, ie. it's easier to implement look-ahead in the codec
> than to do the detection and then try to push the bytes back onto
> the stream (which may or may not be possible depending on the
> nature of the stream).

io.BufferedReader() standardizes a .peek() API, making it trivial. I
don't see why we couldn't require it.

(As an aside, .peek() will fail to do what detect_encodings() needs if
BufferedReader's buffer size is too small. I do wonder if that
limitation is appropriate.)

--
Adam Olsen, aka Rhamphoryncus
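
A minimal sketch of the .peek() approach, for illustration only. The
names sniff_xml_encoding() and open_xml_text() are hypothetical and are
not the xml.detect_encoding() function or the "xml-auto-detection" codec
discussed in this thread. The detection logic is a simplification (BOM
check, then a regex over an ASCII-compatible XML declaration), and the
128-byte look-ahead is an assumed value. The buffer is sized to at least
the look-ahead because .peek() may return fewer bytes than requested.

```python
import codecs
import io
import re

# Hypothetical helper: matches encoding="..." inside an XML declaration.
_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

# UTF-32 BOMs are listed before UTF-16 because the UTF-32-LE BOM
# begins with the same two bytes as the UTF-16-LE BOM.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def sniff_xml_encoding(head, default="utf-8"):
    """Guess an encoding from the leading bytes of an XML document."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    match = _DECL.search(head)
    return match.group(1).decode("ascii") if match else default


def open_xml_text(raw, lookahead=128):
    """Wrap a binary stream as text without consuming the sniffed bytes."""
    size = max(lookahead, io.DEFAULT_BUFFER_SIZE)
    reader = io.BufferedReader(raw, buffer_size=size)
    # peek() leaves the stream position untouched, so nothing has to be
    # pushed back afterwards. It may return more or fewer bytes than asked.
    head = reader.peek(lookahead)[:lookahead]
    return io.TextIOWrapper(reader, encoding=sniff_xml_encoding(head))
```

Usage would look like open_xml_text(open("doc.xml", "rb", buffering=0)).read(),
where doc.xml is a placeholder file name. The text wrapper then decodes
from the first byte, including the declaration that was peeked at.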