20/05/2002 12:35:19 AM, "Andreas Jung" <andreas@andreas-jung.com> wrote:

>Sounds reasonable..but since PyArg_ParseTuple() only applies to function
>arguments it can not be used to convert a unicode object to UCS-2. So what
>is the easiest way to get the UCS-2 representation? PyUnicode_AS_DATA()
>returns for u'computer' a char * with strlen()==1, however
>PyUnicode_GET_DATA_SIZE() on the same string returns 16 (looks fine for
>the two bytes encoding of UCS-2). Am I missing something?
>

Andreas,

If you don't care about surrogates or weird things like the Hong Kong
extended character set that are outside the 2**16 range, pretend
UCS-2 == UTF-16. Then on a narrow Python build, the unicode object is in
effect in UCS-2; no conversion required.

You are indeed missing something about PyUnicode_AS_DATA -- the doc says it
returns a char * pointer to the internal buffer. I can't imagine what
relevance strlen(such_a_pointer) has. The buffer will contain
"c\0o\0m\0 etc etc" when viewed as a series of bytes (on a little-endian
box), so yes, strlen -> 1, but so what?

What is there about the PyUnicode_AS_UNICODE() function that you don't like?

Perhaps you might like to
(a) say what you are trying to achieve
(b) move the discussion to c.l.py

Regards,
John
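
To illustrate the strlen() point above without needing a Python build, here is a minimal sketch in plain C. It assumes nothing about the Python C API: ascii_to_ucs2le() is a hypothetical helper, invented for this example, that lays out an ASCII string as two-byte little-endian code units, which is the byte pattern described above for a narrow build on a little-endian box.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (NOT part of the Python C API): store an ASCII string
   in `out` as UCS-2, little-endian, two bytes per character. Returns the
   number of bytes written. No terminator is appended. */
static size_t ascii_to_ucs2le(const char *ascii, unsigned char *out,
                              size_t out_size)
{
    size_t n = strlen(ascii);
    size_t i;

    assert(2 * n <= out_size);                  /* caller must supply enough room */
    for (i = 0; i < n; i++) {
        out[2 * i] = (unsigned char)ascii[i];   /* low byte comes first */
        out[2 * i + 1] = 0;                     /* high byte is zero for ASCII */
    }
    return 2 * n;
}
```

For "computer" the helper writes 16 bytes, and strlen() on that buffer stops at the zero high byte of the first character, so it reports 1. The byte count has to come from the known size, not from strlen().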