Showing content from https://mail.python.org/pipermail/python-dev/2002-May/024193.html below:

[Python-Dev] getting the UCS-2 representation of a unicode object

John Machin sjmachin@lexicon.net
Mon, 20 May 2002 10:22:39 +1000

20/05/2002 12:35:19 AM, "Andreas Jung" <andreas@andreas-jung.com> wrote:

>Sounds reasonable..but since Py_ParseTuple() only applies to function
>arguments it can not be used to convert a unicode object to UCS-2. So what
>is the easiest way to get the UCS-2 representation? PyUnicode_AS_DATA()
>returns for u'computer' a char * with strlen()==1, however
>PyUnicode_GET_DATA_SIZE() on the same string returns 16 (looks fine for the
>two-byte encoding of UCS-2). Am I missing something?
>

Andreas,

If you don't care about surrogates or weird things like the Hong Kong extended character set that are outside the 2**16 range, pretend UCS-2 == UTF-16. Then on a narrow Python build, the unicode object is in effect in UCS-2; no conversion required.
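
To illustrate when the "UCS-2 == UTF-16" shortcut holds, here is a minimal Python sketch. It rests on assumptions: current Python 3 has no narrow build, so the utf-16-le codec stands in for the 16-bit internal buffer, and the helper names utf16_units and fits_ucs2 are hypothetical, not part of any Python API.

```python
# Sketch only: the utf-16-le codec emulates a narrow build's 16-bit buffer.
# utf16_units() and fits_ucs2() are illustrative helpers, not a real API.

def utf16_units(text):
    """Return the 16-bit code units of text when encoded as UTF-16."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

def fits_ucs2(text):
    """True if no code unit is a surrogate, i.e. UTF-16 and UCS-2 agree."""
    return all(not 0xD800 <= unit <= 0xDFFF for unit in utf16_units(text))

bmp_only = "computer"
outside_bmp = "\U00020000"  # a character beyond the 2**16 range

print(utf16_units(bmp_only))     # one code unit per character
print(fits_ucs2(bmp_only))
print(utf16_units(outside_bmp))  # two code units: a surrogate pair
print(fits_ucs2(outside_bmp))
```

For text that stays inside the 2**16 range, every character maps to exactly one 16-bit unit, which is why no conversion is needed in that case.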

You are indeed missing something about PyUnicode_AS_DATA -- the doc says it returns a char * pointer to the internal buffer. I can't imagine what relevance strlen(such_a_pointer) has. The buffer will contain "c\0o\0m\0 etc etc" when viewed as a series of bytes (on a little-endian box), so yes, strlen stops at the first zero byte and returns 1, but so what?
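
The byte layout described above can be reproduced with a short Python sketch. As an assumption, "computer".encode("utf-16-le") stands in for the internal buffer of a narrow build on a little-endian machine, and c_strlen is a hypothetical helper that imitates what C's strlen() does with such a buffer.

```python
def c_strlen(buffer):
    """Imitate C strlen(): count the bytes before the first NUL byte.

    Raises ValueError if the buffer holds no NUL byte at all.
    """
    return buffer.index(b"\0")

# Stand-in for the internal buffer of u'computer' on a little-endian box.
raw = "computer".encode("utf-16-le")

print(raw)            # b'c\x00o\x00m\x00p\x00u\x00t\x00e\x00r\x00'
print(len(raw))       # 16, matching the size reported in the question
print(c_strlen(raw))  # 1, because the second byte of 'c' is zero
```

The 16-byte size is the meaningful number here; the strlen-style result only reflects where the first zero byte happens to fall.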

What is there about the PyUnicode_AS_UNICODE() function that you don't like?

Perhaps you might like to (a) say what you are trying to achieve (b) move the discussion to c.l.py

Regards,

John
