Hi everybody,

I've just uploaded a new Unicode snapshot. It includes a brand new UTF-16 codec which is BOM aware, meaning that it recognizes BOMs (byte order marks) on input and adjusts the byte order accordingly. On output you can choose either to have a BOM written or to specify explicitly which byte order to use.

Also new in this snapshot is configuration code which figures out the byte order of the installation machine... I looked everywhere in the Python source code but couldn't find any hint as to whether this was already done somewhere, so I simply added some autoconf magic to have two new symbols defined: BYTEORDER_IS_LITTLE_ENDIAN and BYTEORDER_IS_BIG_ENDIAN (mutually exclusive, of course).

BTW, I changed the hash method of Unicode objects to use the UTF-8 string as the basis for the hash code. This means that u'abc' and 'abc' will now be treated as the same dictionary key!

Some documentation also made it into the snapshot. See the file Misc/unicode.txt for all the interesting details about the implementation.

Note that the web page provides a prepatched version of the interpreter for your convenience... just download, run ./configure and make, and you're done.

Could someone with access to an MS VC compiler please update the project files and perhaps send me some feedback about any glitches?! I have never compiled Python on Windows myself and don't have the time to figure it out just now :-/.

Thanks :-)

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
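
The BOM handling described above can be sketched roughly as follows. This is only an illustration and an assumption on my part: it uses the codec names found in today's Python standard library ("utf-16", "utf-16-le", "utf-16-be"), not necessarily the interface of the snapshot, which this message does not spell out.

```python
import codecs
import sys

text = "abc"

# Output with an explicitly chosen byte order: no BOM is written.
le = text.encode("utf-16-le")   # little endian, low byte first
be = text.encode("utf-16-be")   # big endian, high byte first

# Output with the generic codec: a BOM in the machine's native
# byte order is written first, followed by the data in that order.
with_bom = text.encode("utf-16")

# Input: the generic codec reads the BOM, picks the matching byte
# order and drops the BOM from the decoded result.
decoded_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")
decoded_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")

# The native byte order, as the interpreter reports it at run time.
native_bom = (codecs.BOM_UTF16_LE if sys.byteorder == "little"
              else codecs.BOM_UTF16_BE)

print(decoded_le == decoded_be == text)
print(with_bom.startswith(native_bom))
```

Both input streams decode to the same text even though their bytes are in opposite order, which is the whole point of making the codec BOM aware.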