Atsuo Ishimoto wrote:
> On Wed, 22 Jan 2003 10:29:54 +0100
> "M.-A. Lemburg" <mal@lemburg.com> wrote:
>
>>The problem I see is size: Tamito's codecs have an installed
>>size of 1790kB while Hisao's codecs are around 81kB.
>>
>
> You cannot compare size of untared files here.

I was talking about the *installed* size, i.e. the size of the
package in site-packages:

degas site-packages/japanese# du
337     ./c
1252    ./mappings
88      ./python
8       ./aliases
1790    .

Hisao's Python codec is only 85kB in size.

Now, if we took only the C version of Tamito's codec, we'd
end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of 5...

I wonder whether it wouldn't be possible to use the same tricks
Hisao used in his codec for a C version.

> Tamito's codecs package
> contains source of C version and Python version. About 1 MB in 1790kB
> is size of C sources.
>
> So, I'm proposing to add only C version of codec from JapaneseCodecs
> package. As I mentioned, size of C version is about 160 KB in Win32
> binary form, excluding tests and documentations. I don't see a
> significant difference between them.
>
> If size of C sources(about 1 MB) is matter, we may be able to reduce it.

The source code size is not that important. The install size is,
and even more so the memory footprint.

Hisao's approach uses a single table which fits into 58kB of Python
source code. Boil that down to a static C table and you'll end up
with something around 10-20kB of static C data. Hisao still builds
a dictionary from this data at run time, but perhaps that step could
be avoided using the same techniques that Fredrik used to boil down
the size of the unicodedata module (which holds the Unicode Database).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/
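
To make the table idea above a bit more concrete, here is a minimal sketch of one way a dictionary lookup could be replaced by plain array reads. It is not Hisao's or Fredrik's actual code. The code space is cut into fixed-size blocks, each distinct block is stored once, and a small index says which block to use. The names build_tables and lookup, the block size, the MISSING placeholder and the sample entries are all assumptions made for illustration.

```python
# Hypothetical sketch: split a sparse code -> code point mapping into
# fixed-size blocks, store each distinct block once, and index into them.

BLOCK_BITS = 7                 # assumed block size of 128 entries
BLOCK_SIZE = 1 << BLOCK_BITS
MISSING = 0xFFFD               # placeholder for unmapped codes (assumption)


def build_tables(mapping, code_space):
    """Return (index, blocks) for a dict mapping int codes to int code points."""
    index = []    # one entry per block of the code space
    blocks = []   # distinct blocks, each stored only once
    seen = {}     # block contents -> position in `blocks`
    for start in range(0, code_space, BLOCK_SIZE):
        block = tuple(mapping.get(code, MISSING)
                      for code in range(start, start + BLOCK_SIZE))
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks


def lookup(index, blocks, code):
    """Two array reads take the place of one dictionary lookup."""
    return blocks[index[code >> BLOCK_BITS]][code & (BLOCK_SIZE - 1)]


# Illustrative sample entries only, in a 16-bit code space.
sample = {0x8140: 0x3000, 0x8141: 0x3001, 0x889F: 0x4E9C}
index, blocks = build_tables(sample, 0x10000)
```

In C, index and blocks would become static const arrays, so nothing needs to be built at import time. How much space this saves depends entirely on how many blocks in the real mapping turn out to be identical, which is not measured here.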