Showing content from https://mail.python.org/pipermail/python-dev/2002-July/026258.html below:

[Python-Dev] The C API and wide unicode support

Walter Dörwald walter@livinglogic.de
Wed, 10 Jul 2002 18:00:17 +0200
Michael Hudson wrote:

> Walter Dörwald <walter@livinglogic.de> writes:
> 
>>Guido van Rossum wrote:
>>
>>
>>>>Any C function that uses Unicode objects in any way needs name
>>>>mangling, because the storage layout of the Unicode objects
>>>>changes.
>>>
>>>
>>>Really?  If I am only using the published APIs and not peeking
>>>directly inside the Unicode object, why should I care about its
>>>internal lay-out?
>>
>>That's what I meant by "using". Functions that only pass
>>unicode objects around don't need to know (as long as they pass
>>the objects only to functions that themselves either "know"
>>or "don't need to know" the layout).
>>
>>PyUnicode_Decode creates unicode objects, so I guess it needs
>>to know.
> 
> *It* needs to know, yes.  But surely the caller doesn't?

This depends on what the caller does with the result of
PyUnicode_Decode.

>>>Shouldn't only functions whose signature uses PY_UNICODE_TYPE be
>>>name-mangled?  What am I missing?
>>
>>What about the functions that use the C macros (PyUnicode_AS_UNICODE
>>etc.) directly or indirectly? Those functions will rely on the
>>internal lay-out.
> 
> They're verboten in extension modules anyway, so I don't care.

I didn't know that. Neither Include/unicodeobject.h nor
Doc/api/concrete.tex mentions it. Is this documented
anywhere else?

I think forbidding the use of the macros is too restrictive.
What if I want to implement a version of
    foo.replace(u"&", u"&amp;")
       .replace(u"<", u"&lt;")
       .replace(u"\"", u"&quot;")
       .replace(u">", u"&gt;")
in C for performance reasons? How is this possible without
using the C macros?
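
To make the question concrete, here is a minimal sketch of what such a C version might look like. It is not taken from any real extension module: unit_t is a hypothetical stand-in for Py_UNICODE, and xml_escape and escape_for are illustrative names. In an actual extension the input pointer would presumably come from a macro such as PyUnicode_AS_UNICODE, which is exactly the direct buffer access under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Py_UNICODE; the real width depends on the build. */
typedef unsigned int unit_t;

/* Return the ASCII replacement for c, or NULL if c is copied unchanged. */
static const char *escape_for(unit_t c)
{
    switch (c) {
    case '&': return "&amp;";
    case '<': return "&lt;";
    case '"': return "&quot;";
    case '>': return "&gt;";
    default:  return NULL;
    }
}

/* Escape src[0..len) into a freshly malloc'ed buffer (caller frees).
 * Stores the result length in *outlen; returns NULL if allocation fails. */
unit_t *xml_escape(const unit_t *src, size_t len, size_t *outlen)
{
    size_t i, n = 0;
    unit_t *dst, *p;
    const char *rep;

    assert(src != NULL || len == 0);
    assert(outlen != NULL);

    /* Sizing pass: work out how long the escaped text will be. */
    for (i = 0; i < len; i++) {
        rep = escape_for(src[i]);
        if (rep != NULL) {
            while (*rep++ != '\0')
                n++;
        } else {
            n++;
        }
    }

    dst = malloc((n > 0 ? n : 1) * sizeof(unit_t));
    if (dst == NULL)
        return NULL;

    /* Copying pass: write each unit or its replacement. */
    p = dst;
    for (i = 0; i < len; i++) {
        rep = escape_for(src[i]);
        if (rep != NULL) {
            while (*rep != '\0')
                *p++ = (unit_t)(unsigned char)*rep++;
        } else {
            *p++ = src[i];
        }
    }

    *outlen = n;
    return dst;
}
```

The sketch reads the input twice (once to size the result, once to copy) and allocates a single output buffer, whereas the chained replace() calls create an intermediate string at each step. Because "&" is handled in the same pass as the other characters, the ampersands introduced by the replacements are never escaped again, which matches the result of doing the "&" replacement first.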

And if extension modules are not allowed to access the internal
layout of unicode objects, what's the use of name mangling?
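
For readers unfamiliar with the mechanism, name mangling here means renaming the exported symbol with a preprocessor macro that depends on the unit width, so that code built for a different width fails to link instead of misreading the buffer. The following is only a sketch of that idea; DEMO_WIDE, demo_unit, demo_unit_size and demo_caller are hypothetical names, not part of the Python headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build switch standing in for the narrow/wide choice. */
#define DEMO_WIDE 1

#if DEMO_WIDE
typedef unsigned int demo_unit;
#define demo_unit_size demo_unit_size_ucs4
#else
typedef unsigned short demo_unit;
#define demo_unit_size demo_unit_size_ucs2
#endif

/* The source says demo_unit_size, but the linker sees the mangled symbol. */
size_t demo_unit_size(void)
{
    return sizeof(demo_unit);
}

/* A caller compiled with the same switch resolves to the same symbol,
 * so its idea of the unit width agrees with the callee's. */
void demo_caller(void)
{
    assert(demo_unit_size() == sizeof(demo_unit));
}
```

A caller compiled with DEMO_WIDE set to 0 would reference demo_unit_size_ucs2, which this translation unit does not define, so the mismatch would surface as a link error.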

Bye,
    Walter Dörwald




