This PEP attempts to document the status of cross-compilation of downstream projects.
It should give an overview of the approaches currently used by distributors (Linux distros, WASM environment providers, etc.) to cross-compile downstream projects (3rd party extensions, etc.).
Motivation
We write this PEP to express the challenges in cross-compilation and to act as a supporting document for future improvement proposals.
Analysis
Introduction
There are a couple of different approaches being used to tackle this, with different levels of interaction required from the user, but they all require a significant amount of effort. This is due to the lack of standardized cross-compilation infrastructure in the Python packaging ecosystem, which itself stems from the complexity of cross-builds, which makes standardizing it a huge undertaking.
Upstream support
Some major projects like CPython, setuptools, etc. provide some support to help with cross-compilation, but it’s unofficial and on a best-effort basis. For example, the sysconfig module allows overwriting the data module name via the _PYTHON_SYSCONFIGDATA_NAME environment variable, something that is required for cross-builds, and setuptools accepts patches [1] to tweak/fix its logic to be compatible with popular “environment faking” workflows [2].
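As a rough illustration of the sysconfig override mentioned above, the sketch below starts a child interpreter with _PYTHON_SYSCONFIGDATA_NAME pointing at a substitute data module and reads back one variable. The function name, the module name `_sysconfigdata_fake_target`, and the fake SOABI value are hypothetical, and the sketch assumes CPython on a POSIX system, where sysconfig loads its variables from the `build_time_vars` dictionary of that module. A real cross-build would use the data module generated for the target interpreter rather than a hand-written one.

```python
import os
import subprocess
import sys
import tempfile


def read_soabi_with_fake_sysconfigdata(fake_soabi):
    """Return SOABI as seen by a child interpreter using a substitute data module."""
    # Hypothetical module name, standing in for the target's sysconfig data module.
    module_name = "_sysconfigdata_fake_target"
    with tempfile.TemporaryDirectory() as tmp:
        # Write a minimal data module exposing only the variable we want to fake.
        with open(os.path.join(tmp, module_name + ".py"), "w") as f:
            f.write(f"build_time_vars = {{'SOABI': {fake_soabi!r}}}\n")

        env = dict(os.environ)
        env["_PYTHON_SYSCONFIGDATA_NAME"] = module_name
        # The substitute module must be importable by the child interpreter.
        env["PYTHONPATH"] = os.pathsep.join(
            filter(None, [tmp, env.get("PYTHONPATH")])
        )

        result = subprocess.run(
            [
                sys.executable,
                "-c",
                "import sysconfig; print(sysconfig.get_config_var('SOABI'))",
            ],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.strip()
```

The override is applied in a child process because sysconfig caches its configuration variables once they have been loaded in the running interpreter.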
The lack of first-party support in upstream projects leads to cross-compilation being fragile and requiring a significant effort from users, but at the same time, the lack of standardization makes it harder for upstreams to improve support as there’s no clarity on how this feature should be provided.
Projects with decent cross-build support
It seems relevant to point out that there are a few modern Python package build-backends with at least decent cross-compilation support, namely scikit-build and meson-python. Both of these projects integrate mature external build-systems into Python packaging (CMake and Meson, respectively), so cross-build support is inherited from them.
Downstream approaches
Cross-compilation approaches fall on a spectrum that goes from, by design, requiring extensive user interaction to (ideally) almost none. Usually, they are based on one of two main strategies: using a cross-build environment, or faking the target environment.
Cross-build environment
This consists of running the Python interpreter normally and utilizing the cross-build support provided by the projects’ build-system. However, as we saw above, upstream support is lacking, so this approach only works for a small-ish set of projects. When this fails, the usual strategy is to patch the build-system code to use the correct toolchain, system details, etc. [3].
Since this approach often requires package-specific patching, it requires a lot of user interaction.
Faking the target environment
Aiming to drop the requirement for user input, a popular approach is trying to fake the target environment. It generally consists of monkeypatching the Python interpreter to get it to mimic the interpreter on the target system, which involves changing many of the sys module attributes, the sysconfig data, etc. Using this strategy, build-backends do not need to have any cross-build support, and should just work without any code changes.
Unfortunately, though, it isn’t possible to truly fake the target environment. There are many reasons for this, one of the main ones being that it breaks code that actually needs to introspect the running interpreter. As a result, monkeypatching Python to look like the target is very tricky: to achieve the least amount of breakage, we can only patch certain aspects of the interpreter. Consequently, build-backends may need some code changes, but these are generally much smaller than with the previous approach. This is an inherent limitation of the technique, meaning this strategy still requires some user interaction.
Nonetheless, this strategy still works out-of-the-box with significantly more projects than the approach above, and requires much less effort in these cases. It is successful in decreasing the amount of user interaction needed, even though it doesn’t succeed in being generic.
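To make the idea of environment faking concrete, here is a minimal sketch of a context manager that temporarily patches two introspection points. The name `fake_target` and the example values are hypothetical; tools that use this strategy patch far more than this, and this is not how any particular tool is implemented.

```python
import sys
import sysconfig
from contextlib import contextmanager
from unittest import mock


@contextmanager
def fake_target(platform, config_vars):
    """Temporarily make sys.platform and sysconfig.get_config_var mimic a target."""
    real_get_config_var = sysconfig.get_config_var

    def fake_get_config_var(name):
        # Answer from the target's values first, fall back to the real ones.
        if name in config_vars:
            return config_vars[name]
        return real_get_config_var(name)

    with mock.patch.object(sys, "platform", platform), \
            mock.patch.object(sysconfig, "get_config_var", fake_get_config_var):
        yield


# Usage: build code running inside the block sees the faked values.
with fake_target("emscripten", {"EXT_SUFFIX": ".cpython-311-wasm32-emscripten.so"}):
    print(sys.platform, sysconfig.get_config_var("EXT_SUFFIX"))
```

Note that only the patched entry points are affected: `sysconfig.get_config_vars()`, for instance, still reports the values of the running interpreter, which illustrates why partial patching can leave build-backends with an inconsistent view.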
Environment introspection
As explained above, most build system code is written with the assumption that the target system is the same as where the build is occurring, so introspection is usually used to guide the build.
In this section, we try to document most of the ways this is accomplished. It should give a decent overview of the environment details that are required by build systems.
Each entry below gives a snippet, followed by its description and variance.

>>> importlib.machinery.EXTENSION_SUFFIXES
['.cpython-311-x86_64-linux-gnu.so', '.abi3.so', '.so']
    Description: Extension (native module) suffixes supported by this interpreter.
    Variance: This is implementation-defined, but it usually differs based on the implementation, system architecture, build configuration, Python language version, and implementation version, if one exists.

>>> importlib.machinery.SOURCE_SUFFIXES
['.py']
    Description: Source (pure-Python) suffixes supported by this interpreter.
    Variance: This is implementation-defined, but it usually doesn’t differ (outside exotic implementations or systems).

>>> importlib.machinery.all_suffixes()
['.py', '.pyc', '.cpython-311-x86_64-linux-gnu.so', '.abi3.so', '.so']
    Description: All module file suffixes supported by this interpreter. It should be the union of all importlib.machinery.*_SUFFIXES attributes.
    Variance: This is implementation-defined, but it usually differs based on the implementation, system architecture, build configuration, Python language version, and implementation version, if one exists. See the entries above for more information.

>>> sys.abiflags
''
    Description: ABI flags, as specified in PEP 3149.
    Variance: Differs based on the build configuration.

>>> sys.api_version
1013
    Description: C API version.
    Variance: Differs based on the Python installation.

>>> sys.base_prefix
'/usr'
    Description: Prefix of the installation-wide directories where platform independent files are installed.
    Variance: Differs based on the platform, and installation.

>>> sys.base_exec_prefix
'/usr'
    Description: Prefix of the installation-wide directories where platform dependent files are installed.
    Variance: Differs based on the platform, and installation.

>>> sys.byteorder
'little'
    Description: Native byte order.
    Variance: Differs based on the platform.

>>> sys.builtin_module_names
('_abc', '_ast', '_codecs', ...)
    Description: Names of all modules that are compiled into the Python interpreter.
    Variance: Differs based on the platform, system architecture, and build configuration.

>>> sys.exec_prefix
'/usr'
    Description: Prefix of the site-specific directories where platform dependent files are installed. Because it concerns the site-specific directories, in the standard virtual environment implementation it will be a virtual-environment-specific path.
    Variance: Differs based on the platform, installation, and environment.

>>> sys.executable
'/usr/bin/python'
    Description: Path of the Python interpreter being used.
    Variance: Differs based on the installation.

>>> with open(sys.executable, 'rb') as f:
...     header = f.read(4)
...     if is_elf := (header == b'\x7fELF'):
...         elf_class = f.read(1)[0]  # 1 = 32-bit, 2 = 64-bit
...         size = {1: 52, 2: 64}.get(elf_class)  # ELF header size in bytes
...         elf_header = f.read(size - 5)  # rest of the header after the 5 bytes already read
    Description: Whether the Python interpreter is an ELF file, and the ELF header. This approach is sometimes used to identify the target architecture of the installation.
    Variance: Differs based on the installation.

>>> sys.float_info
sys.float_info(
    max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308,
    min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307,
    dig=15, mant_dig=53, epsilon=2.220446049250313e-16,
    radix=2, rounds=1,
)
    Description: Low level information about the float type, as defined by float.h.
    Variance: Differs based on the architecture, and platform.

>>> sys.getandroidapilevel()
21
    Description: Integer representing the Android API level.
    Variance: Differs based on the platform.

>>> sys.getwindowsversion()
sys.getwindowsversion(major=10, minor=0, build=19045, platform=2, service_pack='')
    Description: Windows version of the system.
    Variance: Differs based on the platform.

>>> sys.hexversion
0x30b03f0
    Description: Python version encoded as an integer.
    Variance: Differs based on the Python language version.

>>> sys.implementation
namespace(
    name='cpython',
    cache_tag='cpython-311',
    version=sys.version_info(major=3, minor=11, micro=3, releaselevel='final', serial=0),
    hexversion=0x30b03f0,
    _multiarch='x86_64-linux-gnu',
)
    Description: Interpreter implementation details.
    Variance: Differs based on the interpreter implementation, Python language version, and implementation version, if one exists. It may also include architecture-dependent information, so it may also differ based on the system architecture.

>>> sys.int_info
sys.int_info(bits_per_digit=30, sizeof_digit=4, default_max_str_digits=4300, str_digits_check_threshold=640)
    Description: Low level information about Python’s internal integer representation.
    Variance: Differs based on the architecture, platform, implementation, build, and runtime flags.

>>> sys.maxsize
0x7fffffffffffffff
    Description: Maximum value a variable of type Py_ssize_t can take.
    Variance: Differs based on the architecture, platform, and implementation.

>>> sys.maxunicode
0x10ffff
    Description: Value of the largest Unicode code point.
    Variance: Differs based on the implementation and, on Python versions older than 3.3, the build.

>>> sys.platform
'linux'
    Description: Platform identifier.
    Variance: Differs based on the platform.

>>> sys.prefix
'/usr'
    Description: Prefix of the site-specific directories where platform independent files are installed. Because it concerns the site-specific directories, in the standard virtual environment implementation it will be a virtual-environment-specific path.
    Variance: Differs based on the platform, installation, and environment.

>>> sys.platlibdir
'lib'
    Description: Platform-specific library directory.
    Variance: Differs based on the platform, and vendor.

>>> sys.version_info
sys.version_info(major=3, minor=11, micro=3, releaselevel='final', serial=0)
    Description: Python language version implemented by the interpreter.
    Variance: Differs if the target Python version is not the same [4].

>>> sys.thread_info
sys.thread_info(name='pthread', lock='semaphore', version='NPTL 2.37')
    Description: Information about the thread implementation.
    Variance: Differs based on the platform, and implementation.

>>> sys.winver
'3.8-32'
    Description: Version number used to form Windows registry keys.
    Variance: Differs based on the platform, and implementation.

>>> sysconfig.get_config_vars()
{ ... }
>>> sysconfig.get_config_var(...)
...
    Description: Python distribution configuration variables. It includes a set of variables [5] (like prefix, exec_prefix, etc.) based on the running context [6], and may include some extra variables based on the Python implementation and system.
    In CPython and most other implementations that use the same build-system, the “extra” variables mentioned above are: on POSIX, all variables from the Makefile used to build the interpreter, and on Windows, usually only a small subset of those [7] (like EXT_SUFFIX, BINDIR, etc.).
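The sketch below gathers a handful of the introspection points listed above into a single dictionary, roughly the way a build system might when deciding how to name and compile an extension. The function name and the selection of fields are illustrative assumptions; real build systems read different subsets.

```python
import importlib.machinery
import sys
import sysconfig


def introspect_build_environment():
    """Collect a few interpreter details that build systems commonly rely on."""
    return {
        "extension_suffixes": list(importlib.machinery.EXTENSION_SUFFIXES),
        # sys.abiflags is not available on every platform, hence getattr().
        "abiflags": getattr(sys, "abiflags", None),
        "byteorder": sys.byteorder,
        "platform": sys.platform,
        # Pointer width derived from the largest Py_ssize_t value.
        "pointer_bits": sys.maxsize.bit_length() + 1,
        "implementation": sys.implementation.name,
        "version": tuple(sys.version_info[:3]),
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


for key, value in introspect_build_environment().items():
    print(f"{key}: {value}")
```

Every value returned here describes the interpreter running the build, which is exactly what goes wrong in a cross-build unless these values are overridden or supplied separately for the target.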
sysconfig configuration variables

Each entry below gives the variable name, followed by an example value, its description, and its variance.

SOABI
    Example value: cpython-311-x86_64-linux-gnu
    Description: ABI string, as defined by PEP 3149.
    Variance: Differs based on the implementation, system architecture, Python language version, and implementation version, if one exists.

SHLIB_SUFFIX
    Example value: .so
    Description: Shared library suffix.
    Variance: Differs based on the platform.

EXT_SUFFIX
    Example value: .cpython-311-x86_64-linux-gnu.so
    Description: Interpreter-specific Python extension (native module) suffix, generally defined as .{SOABI}{SHLIB_SUFFIX}.
    Variance: Differs based on the implementation, system architecture, Python language version, and implementation version, if one exists.

LDLIBRARY
    Example value: libpython3.11.so
    Description: Shared libpython library name, if available. If unavailable [8], the variable will be empty; if available, the library should be located in LIBDIR.
    Variance: Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version, if one exists.

PY3LIBRARY
    Example value: libpython3.so
    Description: Shared Python 3 only (major version bound only) [9] libpython library name, if available. If unavailable [8], the variable will be empty; if available, the library should be located in LIBDIR.
    Variance: Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version, if one exists.

LIBRARY
    Example value: libpython3.11.a
    Description: Static libpython library name, if available. If unavailable [8], the variable will be empty; if available, the library should be located in LIBDIR.
    Variance: Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version, if one exists.

Py_DEBUG
    Example value: 0
    Description: Whether this is a debug build.
    Variance: Differs based on the build configuration.

WITH_PYMALLOC
    Example value: 1
    Description: Whether this build has pymalloc support.
    Variance: Differs based on the build configuration.

Py_TRACE_REFS
    Example value: 0
    Description: Whether reference tracing (debug build only) is enabled.
    Variance: Differs based on the build configuration.

Py_UNICODE_SIZE
    Description: Size of the Py_UNICODE object, in bytes. This variable is only present in CPython versions older than 3.3, and was commonly used to detect if the build uses UCS2 or UCS4 for unicode objects, before PEP 393.
    Variance: Differs based on the build configuration.

Py_ENABLE_SHARED
    Example value: 1
    Description: Whether a shared libpython is available.
    Variance: Differs based on the build configuration.

PY_ENABLE_SHARED
    Example value: 1
    Description: Whether a shared libpython is available.
    Variance: Differs based on the build configuration.

CC
    Example value: gcc
    Description: The C compiler used to build the Python distribution.
    Variance: Differs based on the build configuration.

CXX
    Example value: g++
    Description: The C++ compiler used to build the Python distribution.
    Variance: Differs based on the build configuration.

CFLAGS
    Example value: -DNDEBUG -g -fwrapv ...
    Description: The C compiler flags used to build the Python distribution.
    Variance: Differs based on the build configuration.

py_version
    Example value: 3.11.3
    Description: Full form of the Python version.
    Variance: Differs based on the Python language version.

py_version_short
    Example value: 3.11
    Description: Custom form of the Python version, containing only the major and minor numbers.
    Variance: Differs based on the Python language version.

py_version_nodot
    Example value: 311
    Description: Custom form of the Python version, containing only the major and minor numbers, and no dots.
    Variance: Differs based on the Python language version.

prefix
    Example value: /usr
    Description: Same as sys.prefix; please refer to the entry in the table above.
    Variance: Differs based on the platform, installation, and environment.

base
    Example value: /usr
    Description: Same as sys.prefix; please refer to the entry in the table above.
    Variance: Differs based on the platform, installation, and environment.

exec_prefix
    Example value: /usr
    Description: Same as sys.exec_prefix; please refer to the entry in the table above.
    Variance: Differs based on the platform, installation, and environment.

platbase
    Example value: /usr
    Description: Same as sys.exec_prefix; please refer to the entry in the table above.
    Variance: Differs based on the platform, installation, and environment.

installed_base
    Example value: /usr
    Description: Same as sys.base_prefix; please refer to the entry in the table above.
    Variance: Differs based on the platform, and installation.

installed_platbase
    Example value: /usr
    Description: Same as sys.base_exec_prefix; please refer to the entry in the table above.
    Variance: Differs based on the platform, and installation.

platlibdir
    Example value: lib
    Description: Same as sys.platlibdir; please refer to the entry in the table above.
    Variance: Differs based on the platform, and vendor.

SIZEOF_*
    Example value: 4
    Description: Size of a certain C type (double, float, etc.).
    Variance: Differs based on the system architecture, and build details.

Relevant Information
There are some bits of information required by build systems — eg. platform particularities — scattered across many places, and it often is difficult to identify code with assumptions based on them. In this section, we try to document the most relevant cases.
When should extensions be linked against libpython?
When building extensions for dynamic loading, depending on the target platform, they may need to be linked against libpython.
On Windows, extensions need to link against libpython, because all symbols must be resolvable at link time. POSIX-like platforms based on Windows, like Cygwin, MinGW, or MSYS, will also require linking against libpython.
On most POSIX platforms, it is not necessary to link against libpython, as the symbols will already be available due to the interpreter (or, when embedding, the executable/library in question) already linking to libpython. Not linking an extension module against libpython will allow it to be loaded by static Python builds, so, when possible, it is desirable to avoid linking against it (see GH-65735).
This might not be the case on all POSIX platforms, so make sure you check. One example is Android, where only the main executable and LD_PRELOAD entries are considered to be RTLD_GLOBAL (meaning dependencies are RTLD_LOCAL) [10], which causes the libpython symbols to be unavailable when loading the extension.
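The rules above can be summarized as a small decision helper. This is only a sketch: the function name and the platform labels ("windows", "cygwin", "mingw", "msys", "android") are illustrative assumptions, not values reported by any particular API, and other POSIX platforms may need to be added after checking their dynamic loader behavior.

```python
# Hypothetical labels for target platforms that, per the text above,
# require extensions to be linked against libpython.
PLATFORMS_REQUIRING_LIBPYTHON = frozenset(
    {"windows", "cygwin", "mingw", "msys", "android"}
)


def should_link_libpython(target_platform: str) -> bool:
    """Return True if extensions for the target should link against libpython."""
    return target_platform.lower() in PLATFORMS_REQUIRING_LIBPYTHON


for name in ("windows", "android", "linux"):
    print(name, should_link_libpython(name))
```

The key point is that the decision depends on the target platform, so a cross-build must not answer it by inspecting the interpreter that runs the build.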
What are prefix, exec_prefix, base_prefix, and base_exec_prefix?
These are sys attributes set during Python initialization that describe the running environment. They refer to the prefix of directories where installation/environment files are installed, according to the table below.
prefix: platform independent (eg. pure Python), site-specific
exec_prefix: platform dependent (eg. native code), site-specific
base_prefix: platform independent (eg. pure Python), installation-wide
base_exec_prefix: platform dependent (eg. native code), installation-wide
Because the site-specific prefixes will be different inside virtual environments, checking sys.prefix != sys.base_prefix is commonly used to check if we are in a virtual environment.
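As a minimal sketch, the virtual environment check described above can be wrapped in a helper; the function name is a hypothetical choice.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True if the site-specific prefix differs from the installation-wide one."""
    return sys.prefix != sys.base_prefix


print("virtual environment" if in_virtual_environment() else "base installation")
```

Like the other introspection points, this reports on the interpreter running the code, so in a cross-build it describes the build environment rather than the target.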
crossenv is a tool to create a virtual environment with a monkeypatched Python installation that tries to emulate the target machine in certain scenarios. More about this approach can be found in the Faking the target environment section.
XXX: Jaime will write a quick summary once the PEP draft is public.
XXX Uses a modified crossenv.
Yocto Project
XXX: Sent email to the mailing list.
TODO
Buildroot
TODO
Pyodide
XXX: Hood should review/expand this section.
Pyodide provides a Python distribution compiled to WebAssembly using the Emscripten toolchain.
It patches several aspects of the CPython installation and some external components. A custom package manager, micropip, supporting both pure-Python and wasm32/Emscripten wheels, is also provided as part of the distribution. On top of this, a repository with a selected set of third-party packages is also provided and enabled by default.
Beeware
TODO
python-for-android
Resource: https://github.com/Android-for-Python/Android-for-Python-Users
python-for-android is a tool to package Python apps on Android. It creates a Python distribution with your app and its dependencies.
Pure-Python dependencies are handled automatically and in a generic way, but native dependencies need recipes. A set of recipes for popular dependencies is provided, but users need to provide their own recipes for any other native dependencies.
kivy-ios
kivy-ios is a tool to package Python apps on iOS. It provides a toolchain to build a Python distribution with your app and its dependencies, as well as a CLI to create and manage Xcode projects that integrate with the toolchain.
It uses the same approach as python-for-android (also maintained by the Kivy project) for app dependencies — pure-Python dependencies are handled automatically, but native dependencies need recipes, and the project provides recipes for popular dependencies.
AidLearning
TODO
QPython
TODO
pyqtdeploy
Contact: https://www.riverbankcomputing.com/pipermail/pyqt/2023-May/thread.html (contacted Phil, the maintainer)
TODO
Chaquopy
TODO
EDK II
TODO
ActivePython
TODO
Termux
TODO