> I would propose to only add some very basic encodings to
> the standard distribution, e.g. the ones mentioned under
> Standard Codecs in the proposal:
>
>   'utf-8':            8-bit variable length encoding
>   'utf-16':           16-bit variable length encoding (little/big endian)
>   'utf-16-le':        utf-16 but explicitly little endian
>   'utf-16-be':        utf-16 but explicitly big endian
>   'ascii':            7-bit ASCII codepage
>   'latin-1':          Latin-1 codepage
>   'html-entities':    Latin-1 + HTML entities;
>                       see htmlentitydefs.py from the standard Python Lib
>   'jis' (a popular version XXX):
>                       Japanese character encoding
>   'unicode-escape':   See Unicode Constructors for a definition
>   'native':           Dump of the Internal Format used by Python

I would suggest adding the DOS, Windows and Macintosh standard 8-bit charsets
(their equivalents of latin-1) too, as documents in these encodings are pretty
ubiquitous. But maybe these should only be added on the respective platforms.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm
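
To illustrate why the platform charsets matter, here is a minimal sketch of
how the same character ends up as different bytes in each of them. It assumes
the codec names found in present-day Python ('cp437' for DOS, 'cp1252' for
Windows, 'mac_roman' for Macintosh); these names are not taken from the
proposal, and the helper encode_in_charsets is purely hypothetical.

```python
# Assumed codec names from today's Python standard library; the proposal
# under discussion does not name the platform charsets.
PLATFORM_CHARSETS = {
    "latin-1": "latin-1",      # the baseline from the quoted list
    "dos": "cp437",            # original IBM PC / DOS code page
    "windows": "cp1252",       # Windows Western European
    "macintosh": "mac_roman",  # classic Mac OS Roman
}


def encode_in_charsets(text):
    """Return {label: bytes} for text encoded in each platform charset."""
    return {
        label: text.encode(codec)
        for label, codec in PLATFORM_CHARSETS.items()
    }


if __name__ == "__main__":
    for label, raw in encode_in_charsets("\u00e9").items():  # e with acute
        print(label, raw)
```

A document written on one platform and read as latin-1 on another would
therefore show the wrong character, which is the case for shipping these
charsets alongside latin-1.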