On Wed, 10 Nov 1999, Fredrik Lundh wrote:
> Marc-Andre writes:
>
> The internal format for Unicode objects should either use a Python
> specific fixed cross-platform format <PythonUnicode> (e.g. 2-byte
> little endian byte order) or a compiler provided wchar_t format (if
> available). Using the wchar_t format will ease embedding of Python in
> other Unicode aware applications, but will also make internal format
> dumps platform dependent.
>
> having been there and done that, I strongly suggest
> a third option: a 16-bit unsigned integer, in platform
> specific byte order (PY_UNICODE_T). along all other
> roads lie code bloat and speed penalties...

I agree 100% !!

wchar_t will introduce portability issues right on up into the Python
level. The byte-order introduces speed issues and OS interoperability
issues, yet solves no portability problems (Byte Order Marks should still
be present and used).

There are two "platforms" out there that use Unicode: Win32 and Java. They
both use UCS-2, AFAIK.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/