Macros in the C API will be converted to static inline functions or regular functions. This will help avoid macro pitfalls in C/C++, and make the functions usable from other programming languages.
To avoid compiler warnings, function arguments of pointer types will be cast to appropriate types using additional macros. The cast will not be done in the limited C API version 3.11: users who opt in to the new limited API may need to add casts to the exact expected type.
To avoid introducing incompatible changes, macros which can be used as l-value in an assignment will not be converted.
RationaleThe use of macros may have unintended adverse effects that are hard to avoid, even for experienced C developers. Some issues have been known for years, while others have been discovered recently in Python. Working around macro pitfalls makes the macro code harder to read and to maintain.
Converting macros to functions has multiple advantages:
Functions don’t need the following workarounds for macro pitfalls, making them usually easier to read and to maintain than similar macro code:
do { ... } while (0)
to write multiple statements.Converting macros and static inline functions to regular functions makes these regular functions accessible to projects which use Python but cannot use macros and static inline functions.
Specification Convert macros to static inline functionsMost macros will be converted to static inline functions.
The following macros will not be converted:
#define Py_HAVE_CONDVAR
.#define METH_VARARGS 0x0001
.Py_GCC_ATTRIBUTE()
, Py_ALWAYS_INLINE
, Py_MEMCPY()
.PyAPI_FUNC
, Py_DEPRECATED
, Py_PYTHON_H
.Py_STRINGIFY()
.Py_BEGIN_ALLOW_THREADS
(contains an unpaired }
), Py_VISIT
(relies on specific variable names), Py_RETURN_RICHCOMPARE (returns from the calling function).PyBytes_AS_STRING()
.Static inline functions in the public C API may be converted to regular functions, but only if there is no measurable performance impact of changing the function. The performance impact should be measured with benchmarks.
Cast pointer argumentsCurrently, most macros accepting pointers cast pointer arguments to their expected types. For example, in Python 3.6, the Py_TYPE()
macro casts its argument to PyObject*
:
#define Py_TYPE(ob) (((PyObject*)(ob))->ob_type)
The Py_TYPE()
macro accepts the PyObject*
type, but also any pointer types, such as PyLongObject*
and PyDictObject*
.
Functions are strongly typed, and can only accept one type of argument.
To avoid compiler errors and warnings in existing code, when a macro is converted to a function and the macro casts at least one of its arguments a new macro will be added to keep the cast. The new macro and the function will have the same name.
Example with the Py_TYPE()
macro converted to a static inline function:
static inline PyTypeObject* Py_TYPE(PyObject *ob) { return ob->ob_type; } #define Py_TYPE(ob) Py_TYPE((PyObject*)(ob))
The cast is kept for all pointer types, not only PyObject*
. This includes casts to void*
: removing a cast to void*
would emit a new warning if the function is called with a const void*
variable. For example, the PyUnicode_WRITE()
macro casts its data argument to void*
, and so it currently accepts const void*
type, even though it writes into data. This PEP will not change this.
The casts will be excluded from the limited C API version 3.11 and newer. When an API user opts into the new limited API, they must pass the expected type or perform the cast.
As an example, Py_TYPE()
will be defined like this:
static inline PyTypeObject* Py_TYPE(PyObject *ob) { return ob->ob_type; } #if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 < 0x030b0000 # define Py_TYPE(ob) Py_TYPE((PyObject*)(ob)) #endifReturn type is not changed
When a macro is converted to a function, its return type must not change to prevent emitting new compiler warnings.
For example, Python 3.7 changed the return type of PyUnicode_AsUTF8()
from char*
to const char*
(commit). The change emitted new compiler warnings when building C extensions expecting char*
. This PEP doesn’t change the return type to prevent this issue.
The PEP is designed to avoid C API incompatible changes.
Only C extensions explicitly targeting the limited C API version 3.11 must now pass the expected types to functions: pointer arguments are no longer cast to the expected types.
Function arguments of pointer types are still cast and return types are not changed to prevent emitting new compiler warnings.
Macros which can be used as l-value in an assignment are not modified by this PEP to avoid incompatible changes.
Examples of Macro Pitfalls Duplication of side effectsMacros:
#define PySet_Check(ob) \ (Py_IS_TYPE(ob, &PySet_Type) \ || PyType_IsSubtype(Py_TYPE(ob), &PySet_Type)) #define Py_IS_NAN(X) ((X) != (X))
If the op or the X argument has a side effect, the side effect is duplicated: it executed twice by PySet_Check()
and Py_IS_NAN()
.
For example, the pos++
argument in the PyUnicode_WRITE(kind, data, pos++, ch)
code has a side effect. This code is safe because the PyUnicode_WRITE()
macro only uses its 3rd argument once and so does not duplicate pos++
side effect.
Example of the bpo-43181: Python macros don’t shield arguments. The PyObject_TypeCheck()
macro before it has been fixed:
#define PyObject_TypeCheck(ob, tp) \ (Py_IS_TYPE(ob, tp) || PyType_IsSubtype(Py_TYPE(ob), (tp)))
C++ usage example:
PyObject_TypeCheck(ob, U(f<a,b>(c)))
The preprocessor first expands it:
(Py_IS_TYPE(ob, f<a,b>(c)) || ...)
C++ "<"
and ">"
characters are not treated as brackets by the preprocessor, so the Py_IS_TYPE()
macro is invoked with 3 arguments:
ob
f<a
b>(c)
The compilation fails with an error on Py_IS_TYPE()
which only takes 2 arguments.
The bug is that the op and tp arguments of PyObject_TypeCheck()
must be put between parentheses: replace Py_IS_TYPE(ob, tp)
with Py_IS_TYPE((ob), (tp))
. In regular C code, these parentheses are redundant, can be seen as a bug, and so are often forgotten when writing macros.
To avoid Macro Pitfalls, the PyObject_TypeCheck()
macro has been converted to a static inline function: commit.
Example showing the usage of commas in a macro which has a return value.
Python 3.7 macro:
#define PyObject_INIT(op, typeobj) \ ( Py_TYPE(op) = (typeobj), _Py_NewReference((PyObject *)(op)), (op) )
Python 3.8 function (simplified code):
static inline PyObject* _PyObject_INIT(PyObject *op, PyTypeObject *typeobj) { Py_TYPE(op) = typeobj; _Py_NewReference(op); return op; } #define PyObject_INIT(op, typeobj) \ _PyObject_INIT(_PyObject_CAST(op), (typeobj))
"\"
."return op;"
rather than the surprising ", (op)"
syntax at the end of the macro.PyObject*
and so doesn’t need casts like (PyObject *)(op)
.typeobj
, rather than (typeobj)
.Example showing the usage of an #ifdef
inside a macro.
Python 3.7 macro (simplified code):
#ifdef COUNT_ALLOCS # define _Py_INC_TPALLOCS(OP) inc_count(Py_TYPE(OP)) # define _Py_COUNT_ALLOCS_COMMA , #else # define _Py_INC_TPALLOCS(OP) # define _Py_COUNT_ALLOCS_COMMA #endif /* COUNT_ALLOCS */ #define _Py_NewReference(op) ( \ _Py_INC_TPALLOCS(op) _Py_COUNT_ALLOCS_COMMA \ Py_REFCNT(op) = 1)
Python 3.8 function (simplified code):
static inline void _Py_NewReference(PyObject *op) { _Py_INC_TPALLOCS(op); Py_REFCNT(op) = 1; }PyUnicode_READ_CHAR()
This macro reuses arguments, and possibly calls PyUnicode_KIND
multiple times:
#define PyUnicode_READ_CHAR(unicode, index) \ (assert(PyUnicode_Check(unicode)), \ assert(PyUnicode_IS_READY(unicode)), \ (Py_UCS4) \ (PyUnicode_KIND((unicode)) == PyUnicode_1BYTE_KIND ? \ ((const Py_UCS1 *)(PyUnicode_DATA((unicode))))[(index)] : \ (PyUnicode_KIND((unicode)) == PyUnicode_2BYTE_KIND ? \ ((const Py_UCS2 *)(PyUnicode_DATA((unicode))))[(index)] : \ ((const Py_UCS4 *)(PyUnicode_DATA((unicode))))[(index)] \ ) \ ))
Possible implementation as a static inlined function:
static inline Py_UCS4 PyUnicode_READ_CHAR(PyObject *unicode, Py_ssize_t index) { assert(PyUnicode_Check(unicode)); assert(PyUnicode_IS_READY(unicode)); switch (PyUnicode_KIND(unicode)) { case PyUnicode_1BYTE_KIND: return (Py_UCS4)((const Py_UCS1 *)(PyUnicode_DATA(unicode)))[index]; case PyUnicode_2BYTE_KIND: return (Py_UCS4)((const Py_UCS2 *)(PyUnicode_DATA(unicode)))[index]; case PyUnicode_4BYTE_KIND: default: return (Py_UCS4)((const Py_UCS4 *)(PyUnicode_DATA(unicode)))[index]; } }Macros converted to functions since Python 3.8
This is a list of macros already converted to functions between Python 3.8 and Python 3.11. Even though some converted macros (like Py_INCREF()
) are very commonly used by C extensions, these conversions did not significantly impact Python performance and most of them didn’t break backward compatibility.
Python 3.8:
Py_DECREF()
Py_INCREF()
Py_XDECREF()
Py_XINCREF()
PyObject_INIT()
PyObject_INIT_VAR()
_PyObject_GC_UNTRACK()
_Py_Dealloc()
Python 3.9:
PyIndex_Check()
PyObject_CheckBuffer()
PyObject_GET_WEAKREFS_LISTPTR()
PyObject_IS_GC()
PyObject_NEW()
: alias to PyObject_New()
PyObject_NEW_VAR()
: alias to PyObjectVar_New()
To avoid performance slowdown on Python built without LTO, private static inline functions have been added to the internal C API:
_PyIndex_Check()
_PyObject_IS_GC()
_PyType_HasFeature()
_PyType_IS_GC()
Python 3.11:
PyObject_CallOneArg()
PyObject_Vectorcall()
PyVectorcall_Function()
_PyObject_FastCall()
To avoid performance slowdown on Python built without LTO, a private static inline function has been added to the internal C API:
_PyVectorcall_FunctionInline()
While other converted macros didn’t break the backward compatibility, there is an exception.
The 3 macros Py_REFCNT()
, Py_TYPE()
and Py_SIZE()
have been converted to static inline functions in Python 3.10 and 3.11 to disallow using them as l-value in assignment. It is an incompatible change made on purpose: see bpo-39573 for the rationale.
This PEP does not propose converting macros which can be used as l-value to avoid introducing new incompatible changes.
Performance concerns and benchmarksThere have been concerns that converting macros to functions can degrade performance.
This section explains performance concerns and shows benchmark results using PR 29728, which replaces the following static inline functions with macros:
PyObject_TypeCheck()
PyType_Check()
, PyType_CheckExact()
PyType_HasFeature()
PyVectorcall_NARGS()
Py_DECREF()
, Py_XDECREF()
Py_INCREF()
, Py_XINCREF()
Py_IS_TYPE()
Py_NewRef()
Py_REFCNT()
, Py_TYPE()
, Py_SIZE()
The benchmarks were run on Fedora 35 (Linux) with GCC 11 on a laptop with 8 logical CPUs (4 physical CPU cores).
Static inline functionsFirst of all, converting macros to static inline functions has negligible impact on performance: the measured differences are consistent with noise due to unrelated factors.
Static inline functions are a new feature in the C99 standard. Modern C compilers have efficient heuristics to decide if a function should be inlined or not.
When a C compiler decides to not inline, there is likely a good reason. For example, inlining would reuse a register which requires to save/restore the register value on the stack and so increases the stack memory usage, or be less efficient.
Benchmark of the ./python -m test -j5
command on Python built in release mode with gcc -O3
, LTO and PGO:
There is no significant performance difference between macros and static inline functions when static inline functions are inlined.
Debug buildPerformance in debug builds can suffer when macros are converted to functions. This is compensated by better debuggability: debuggers can retrieve function names, set breakpoints inside functions, etc.
On Windows, when Python is built in debug mode by Visual Studio, static inline functions are not inlined.
On other platforms, ./configure --with-pydebug
uses the -Og
compiler option on compilers that support it (including GCC and LLVM Clang). -Og
means “optimize debugging experience”. Otherwise, the -O0
compiler option is used. -O0
means “disable most optimizations”.
With GCC 11, gcc -Og
can inline static inline functions, whereas gcc -O0
does not inline static inline functions.
Benchmark of the ./python -m test -j10
command on Python built in debug mode with gcc -O0
(that is, compiler optimizations, including inlining, are explicitly disabled):
Replacing macros with static inline functions makes Python 1.04x slower when the compiler does not inline static inline functions.
Note that benchmarks should not be run on a Python debug build. Moreover, using link-time optimization (LTO) and profile-guided optimization (PGO) is recommended for best performance and reliable benchmarks. PGO helps the compiler to decide if functions should be inlined or not.
Force inliningThe Py_ALWAYS_INLINE
macro can be used to force inlining. This macro uses __attribute__((always_inline))
with GCC and Clang, and __forceinline
with MSC.
Previous attempts to use Py_ALWAYS_INLINE
didn’t show any benefit, and were abandoned. See for example bpo-45094 “Consider using __forceinline
and __attribute__((always_inline))
on static inline functions (Py_INCREF
, Py_TYPE
) for debug build”.
When the Py_INCREF()
macro was converted to a static inline function in 2018 (commit), it was decided not to force inlining. The machine code was analyzed with multiple C compilers and compiler options, and Py_INCREF()
was always inlined without having to force inlining. The only case where it was not inlined was the debug build. See discussion in bpo-35059 “Convert Py_INCREF()
and PyObject_INIT()
to inlined functions”.
On the other side, the Py_NO_INLINE
macro can be used to disable inlining. It can be used to reduce the stack memory usage, or to prevent inlining on LTO+PGO builds, which generally inline code more aggressively: see bpo-33720. The Py_NO_INLINE
macro uses __attribute__ ((noinline))
with GCC and Clang, and __declspec(noinline)
with MSC.
This technique is available, though we currently don’t know a concrete function for which it would be useful. Note that with macros, it is not possible to disable inlining at all.
Rejected Ideas Keep macros, but fix some macro issuesMacros are always “inlined” with any C compiler.
The duplication of side effects can be worked around in the caller of the macro.
People using macros should be considered “consenting adults”. People who feel unsafe with macros should simply not use them.
These ideas are rejected because macros are error prone, and it is too easy to miss a macro pitfall when writing and reviewing macro code. Moreover, macros are harder to read and maintain than functions.
Post Historypython-dev mailing list threads:
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4