[Andy writes:]
> Leave JISXXX and the CJK stuff out. If you get into Japanese, you
> really need to cover ShiftJIS, EUC-JP and JIS, they are big, and there

[Then Marc replies:]
> 2. give more information to the unicodec registry:
>    one could register classes instead of instances which the Unicode

[Jack chimes in with:]
> I would suggest adding the Dos, Windows and Macintosh
> standard 8-bit charsets (their equivalents of latin-1) too,
> as documents in these encoding are pretty ubiquitous.
> But maybe these should only be added on the respective platforms.

[And the conversation twisted around to Greg noting:]
> Next, the number of "open" calls:
>
>             Solaris   Linux   IRIX
>   Perl           16      10      9
>   Python        107      71     48

This is leading me to conclude that our "codec registry" should be the
file system, and Python modules.

Would it be possible to define a "standard package" called "encodings",
and when we need an encoding, we simply attempt to load a module from
that package? The key benefits I see are:

* No need to load modules simply to register a codec (which would make
  the number of open calls even higher, and the startup time even
  slower.) This makes it true demand-loading of the codecs, rather
  than explicit load-and-register.

* Making language-specific distributions becomes simple - simply select
  a different set of modules from the "encodings" directory. The Python
  source distribution has them all, but (say) the Windows binary
  installer selects only a few. The Japanese binary installer for
  Windows installs a few more.

* Installing new codecs becomes trivial - no need to hack site.py etc -
  simply copy the new "codec module" to the encodings directory and you
  are done.

* No serious problem for GMcM's installer nor for freeze.

We would probably need to assume that certain codecs exist for _all_
platforms and languages - but this is no different to assuming that
"exceptions.py" also exists for all platforms.

Is this worthy of consideration?

Mark.