Yesterday I ran into a bug in the C API docs. The top of this page:

    http://docs.python.org/api/unicodeObjects.html

says:

    Py_UNICODE
        This type represents a 16-bit unsigned storage type which is
        used by Python internally as basis for holding Unicode ordinals.
        On platforms where wchar_t is available and also has 16-bits,
        Py_UNICODE is a typedef alias for wchar_t to enhance native
        platform compatibility. On all other platforms, Py_UNICODE is a
        typedef alias for unsigned short.

This is incorrect on some platforms: on Debian, Py_UNICODE turns out to be 32 bits. I'm not sure what the correct quote should be: Does python use wchar_t whenever it's available (16 bits or not)?

I solved my problem by realizing that I was going about things entirely wrong, and that I should use the python codecs from C and not worry about what Py_UNICODE contains. However, I think we should fix the docs to avoid confusing others... or maybe it would be better to document what's in Py_UNICODE and suggest always using the codec methods? I don't have a strong opinion either way.

robey
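
For anyone wanting to check the 16-bit assumption on their own machine, here is a minimal sketch in plain C that reports how wide the platform's wchar_t is. It deliberately uses no Python headers, so it only shows what the C library provides, not how a particular Python build defines Py_UNICODE. The helper names wchar_bits and report_wchar_width are hypothetical, made up for this illustration.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* wchar_t */
#include <stdio.h>   /* printf */

/* Hypothetical helper: width of the platform's wchar_t, in bits. */
unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Hypothetical helper: print the width so it can be compared with
 * the "16-bit" wording quoted from the docs. */
void report_wchar_width(void)
{
    printf("wchar_t is %u bits wide\n", wchar_bits());
}
```

Calling report_wchar_width() from main typically reports 32 bits on Linux with glibc and 16 bits on Windows, so code that assumes a fixed 16-bit unit will not be portable. Going through the codec functions avoids depending on the width at all.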