On Mon, 15 Nov 1999 20:20:55 +0100, you wrote:

>These are all great ideas, but I think they unnecessarily
>complicate the proposal.

However, to claim that Python is properly internationalized, we will need a large number of multi-byte encodings to be available. It's a large amount of work, it must be provably correct, and someone's going to have to do it. So if anyone with more C expertise than me - not hard :-) - is interested

I'm not suggesting putting my points in the Unicode proposal - in fact, I'm very happy we have a proposal which allows for extension, and lets us work on the encodings separately (and later).

>Since Codecs can be registered at runtime, there is quite
>some potential there for extension writers coding their
>own fast codecs. E.g. one could use mxTextTools as codec
>engine working at C speeds.

Exactly my thoughts, although I was thinking of a more slimmed down and specialized one. The right tool might be usable for things like compression algorithms too. Separate project to the Unicode stuff, but if anyone is interested, talk to me.

>I would propose to only add some very basic encodings to
>the standard distribution, e.g. the ones mentioned under
>Standard Codecs in the proposal:
>
>  'utf-8':            8-bit variable length encoding
>  'utf-16':           16-bit variable length encoding (little/big endian)
>  'utf-16-le':        utf-16 but explicitly little endian
>  'utf-16-be':        utf-16 but explicitly big endian
>  'ascii':            7-bit ASCII codepage
>  'latin-1':          Latin-1 codepage
>  'html-entities':    Latin-1 + HTML entities;
>                      see htmlentitydefs.py from the standard Python Lib
>  'jis' (a popular version XXX):
>                      Japanese character encoding
>  'unicode-escape':   See Unicode Constructors for a definition
>  'native':           Dump of the Internal Format used by Python
>

Leave JISXXX and the CJK stuff out. If you get into Japanese, you really need to cover ShiftJIS, EUC-JP and JIS, they are big, and there are lots of options about how to do it. The other ones are algorithmic and can be small and fast and fit into the core. Ditto with HTML, and maybe even escaped-unicode too.

In summary, the current discussion is clearly doing the right things, but is only covering a small percentage of what needs to be done to internationalize Python fully.

- Andy