I've just had the chance to examine the unicode implementation and was surprised by the size of the code introduced: not just the size of the database extension module (which I understand Christian Tismer is optimizing and which I assume can be configured away), but in particular the size of the additional objects (unicodeobject.c, unicodetype.c). These additional objects alone contribute approximately 100K to the resulting executable.

On desktop systems, this is not of much concern, and suggestions have been made previously to reduce this if necessary (shared extension modules and possibly a shared VM, libpython.so). However, on small embedded systems (e.g., PalmIII), this additional code is a tremendous burden. The current size of the python-1.5.2-pre-unicode VM (after removal of float and complex objects, with more reductions to come) on the PalmIII is 240K (already huge by Palm standards). (For reference, the size of python-1.5.1 on the PalmIII is 160K, after removal of the compiler, parser, and float/long/complex objects.) With the unicode additions, this value jumps to 340K.

The upshot of this is that for the small platforms on which I am working, unicode support will have to be removed. My immediate concern is that unicode is getting so embedded in python that it will be difficult to extract.

The approach I've taken for removing "features" (like float objects):

1) removes the feature with WITHOUT_XXX #ifdef/#endif decorations, where XXX denotes the removable feature (configurable in config.h)

2) preserves the python API: builtin functions, C API, PyArg_Parse, print format specifiers, etc., raise MissingFeatureError if attempts are made to use them. Of course, the API associated with the removed feature itself is no longer present.

3) protects the reduced VM: all reads (via marshal, compile, etc.) involving source/compiled python code will fail with a MissingFeatureError if the reduced VM doesn't support it.

4) does not yet support a MissingFeatureError in the tokenizer if, say, 2.2 (for removed floats) is entered on the python command line. This instead results in a SyntaxError indicating a problem with the decimal point. It appears that another error token would have to be added to support this error.

Of course, I may have missed something, but if the above appears to be a reasonable approach, I can supply patches (at least for floats and complexes) for further discussion.

In the longer term, it would be helpful if developers would follow this (or a similar agreed-upon approach) when adding new features. This would reduce the burden of maintaining python for small embedded platforms.

Thanks,
Jeff
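
For illustration only, points 1 and 2 of the approach above might look roughly like the following standalone C sketch. It does not use the real Python headers: WITHOUT_FLOAT stands in for a WITHOUT_XXX switch from config.h, and parse_float_arg, set_missing_feature and last_error_message are hypothetical names invented here, not part of the Python C API or of any actual patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical switch; in the scheme described above it would be set in config.h. */
#define WITHOUT_FLOAT 1

/* Stand-in for raising MissingFeatureError: just records a message. */
static char last_error[64];

static void set_missing_feature(const char *feature)
{
    snprintf(last_error, sizeof last_error, "MissingFeatureError: %s", feature);
}

const char *last_error_message(void)
{
    return last_error;
}

/*
 * The entry point stays in the API in every build, so callers still link.
 * Returns 0 on success, -1 on failure.
 */
int parse_float_arg(const char *text, double *out)
{
#ifdef WITHOUT_FLOAT
    /* Feature compiled out: report the missing feature instead of parsing. */
    (void)text;
    (void)out;
    set_missing_feature("float");
    return -1;
#else
    /* Full build: the float-handling code is compiled in. */
    return sscanf(text, "%lf", out) == 1 ? 0 : -1;
#endif
}
```

With WITHOUT_FLOAT defined, a call such as parse_float_arg("2.2", &d) returns -1 and leaves a MissingFeatureError message for the caller to inspect; removing the define restores the normal parsing path without changing any call sites.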