__sitecustomize__
__sitecustomize__
directories__sitecustomize__
discovery__sitecustomize__
pth
files__sitecustomize__
sitecustomize.py
and usercustomize.py
This PEP proposes supporting extensible customization of the interpreter by allowing users to install files that will be executed at startup.
PEP RejectionPEP 648 was rejected by the steering council as it has a limited number of use cases and further complicates the startup sequence.
MotivationSystem administrators, tools that repackage the interpreter and some libraries need to customize aspects of the interpreter at startup time.
This is usually achieved via sitecustomize.py
for system administrators whilst libraries rely on exploiting pth
files. This PEP proposes a way of achieving the same functionality in a more user-friendly and structured way.
pth
files
If a library needs to perform any customization before an import or that relates to the general working of the interpreter, they often rely on the fact that pth
files, which are loaded at startup and implemented via the site module [7], can include Python code that will be executed when the pth
file is evaluated.
Note that pth
files were originally developed to just add additional directories to sys.path
, but they may also contain lines which start with “import”, which will be passed to exec()
. Users have exploited this feature to allow the customizations that they needed. See setuptools [4] or betterexceptions [5] as examples.
Using pth
files for this purpose is far from ideal for library developers, as they need to inject code into a single line preceded by an import, making it rather unreadable. Library developers following that practice will usually create a module that performs all actions on import, as done by betterexceptions [5], but the approach is still not really user friendly.
Additionally, it is also non-ideal for users of the interpreter if they want to inspect what is being executed at Python startup as they need to review all the pth
files for potential code execution which can be spread across all site paths. Most of those pth
files will be “legitimate” pth
files that just modify the path, answering the question of “what is changing my interpreter at startup” a rather complex one.
Lastly, there have been multiple suggestions for removing code execution from pth
files, see [1] and [2].
sitecustomize.py
Whilst sitecustomize is an acceptable solution, it assumes a single person is in charge of the system and the interpreter. If both the system administrator and the responsibility of provisioning the interpreter want to add customizations at the interpreter startup they need to agree on the contents of the file and combine all the changes. This is not a major limitation though, and it is not the main driver of this change. Should the change happen, it will also improve the situation for these users, as rather than having a sitecustomize.py
which performs all those actions, they can have custom isolated files named after the features they want to enhance. As an example, Ubuntu could change their current sitecustomize.py
to just be ubuntu_apport_python_hook
. This not only better represents its intent but also gives users of the interpreter a better understanding of the modifications happening on their interpreter.
This PEP proposes supporting extensible customization of the interpreter at startup by executing all files discovered in directories named __sitecustomize__
in sitepackages [8] or usersitepackages [9] at startup time.
__sitecustomize__
The name aims to follow the already existing concept of sitecustomize.py
. As the directory will be within sys.path
, given that it is located in site paths, we choose to use double underscore around its name, to prevent colliding with the already existing sitecustomize.py
.
__sitecustomize__
directories
The Python interpreter will look at startup for directory named __sitecustomize__
within any of the standard site-packages path.
These are commonly the Python system location and the user location, but are ultimately defined by the site module logic.
Users can use site.sitepackages
[8] and site.usersitepackages
[9] to know the paths where the interpreter can discover __sitecustomize__
directories.
__sitecustomize__
discovery
The __sitecustomize__
directories will be discovered exactly after pth
files are discovered in a site-packages path as part of site.addsitedir
[10].
These is repeated for each of the site-packages path in the exact same order that is being followed today for pth
files.
__sitecustomize__
The implementation will execute the files within __sitecustomize__
by sorting them by name when discovering each of the __sitecustomize__
directories. We discourage users to rely on the order of execution though.
We considered executing them in random order, but that could result in different results depending on how the interpreter chooses to pick up those files. So even if it won’t be a good practice to rely on other files being executed, we think that is better than having randomly different results on interpreter startup. We chose to run the files after the pth
files in case a user needs to add items to the path before running a files.
pth
files
pth
files can be used to add paths into sys.path
, but this should not affect the __sitecustomize__
discovery process, as those directories are looked up exclusively in site-packages paths.
__sitecustomize__
When a __sitecustomize__
directory is discovered, all of the files that have a .py
extension within it will be read with io.open_code
and executed by using exec
[11].
An empty dictionary will be passed as globals
to the exec
function to prevent unexpected interactions between different files.
Any error on the execution of any of the files will not be logged unless the interpreter is run in verbose mode and it should not stop the evaluation of other files. The user will receive a message in stderr saying that the file failed to be executed and that verbose mode can be used to get more information. This behaviour mimics the one existing for sitecustomize.py
.
The customizations applied to an interpreter via the new __sitecustomize__
solutions will continue to work when a user creates a virtual environment the same way that sitecustomize.py
interact with virtual environments.
This is a difference when compared to pth
files, which are not propagated into virtual environments unless include-system-site-packages
is enabled.
If library maintainers have features installed via __sitecustomize__
that they do not want to propagate into virtual environments, they should detect if they are running within a virtual environment by checking sys.prefix == sys.base_prefix
. This behavior is similar to packages that modify the global sitecustomize.py
.
sitecustomize.py
and usercustomize.py
Until removed, sitecustomize
and usercustomize
will be executed after __sitecustomize__
similar to pth files. See the Backward compatibility section for information on removal plans for sitecustomize
and usercustomize
.
To facilitate debugging of the Python startup, if the site module is invoked it will print the __sitecustomize__
directories that will be discovered on startup.
Packages will be encouraged to include the name of the package within the name of the file to avoid collisions between packages. But the only requirement on the filename is that it ends in .py
for the interpreter to execute them.
In some scenarios, like when the startup time is key, it might be desired to disable this option altogether. The already existing flag -S
[3] will disable all site
-related manipulation, including this new feature. If the flag is passed in, __sitecustomize__
directories will not be discovered.
Additionally, to allow for starting the interpreter disabling only this new feature a new option will be added under -X
: disablesitecustomize
, which will disable the discovery of __sitecustomize__
exclusively.
Lastly, the user can disable the discovery of __sitecustomize__
directories only in the user site by disabling the user site via any of the multiple options in the site.py
module.
Whilst build backends can choose to provide an option to facilitate the installation of these files into a __sitecustomize__
directory, this PEP does not address that directly. Similar to pth
files, build backends can choose to not provide an easy-to-configure mechanism for __sitecustomize__
files and let users hook into the installation process to include such files. We do not think build backends enhanced support as a requirement for this PEP.
A concern in this implementation is how Python interpreter startup time can be affected by this addition. We expect the performance impact to be highly coupled to the logic in the files that a user or sysadmin installs in the Python environment being tested.
If the interpreter has any files in their __sitecustomize__
directory, the file execution time plus a call reading the code will be added to the startup time. This is similar to how code execution is impacting startup time through sitecustomize.py
, usercustomize.py
and code in pth
files. We will therefore focus here on comparing this solution against those three, as otherwise the actual time added to startup is highly dependent on the code that is being executed in those files.
Results were gathered by running “./python.exe -c pass” with perf on 50 iterations, repeating 50 times the command on each iteration and getting the geometric mean of all the results. The file used to run those benchmarks is checked in in the reference implementation [6].
The benchmark was run with 3.10 alpha 7 compiled with PGO and LTO with the following parameters and system state:
The code placed to be executed in pth
files, sitecustomize.py
, usercustomize.py
and files within __sitecustomize__
is the following:
import time; x = time.time() ** 5
The file is aimed at execution a simple operation but still expected to be negligible. This is to put the experiment in a situation where we make visible any hit on performance due to the mechanism whilst still making it relatively realistic. Additionally, it starts with an import and is a single line to be able to be used in pth
files.
sitecustomize.py
usercustomize.py
pth
__sitecustomize__
Run 1 Run 2 1 0 0 0 Dir not created 13884 13897 2 0 0 0 0 13871 13818 3 0 0 1 0 13964 13924 4 0 0 0 1 13940 13939 5 1 1 0 0 13990 13993 6 0 0 0 2 (system + user) 14063 14040 7 0 0 50 0 16011 16014 8 0 0 0 50 15456 15448
Results can be reproduced with run-benchmark.py
script provided in the reference implementation [6].
We interpret the following from these results:
__sitecustomize__
scripts compared to sitecustomize.py
and usercustomize.py
slows down the interpreter by 0.3%. We expect this slowdown until sitecustomize.py
and usercustomize.py
are removed in a future release as even if the user does not create the files, the interpreter will still attempt to import them.__sitecustomize__
produces a speedup of ~3.5% in startup. Which is likely related to the simpler logic to evaluate __sitecustomize__
files compared to pth
file execution.A new audit event will be added and triggered on __sitecustomize__
execution to facilitate security inspection by calling sys.audit
[12] with “sitecustimze.exec_file” as name and the filename as argument.
This PEP aims to move all code execution from pth
files to files within a __sitecustomize__
directory. We think this is an improvement to system admins for the following reasons:
pth
files.__sitecustomize__
directory, potentially allowing users to install only packages that does not change the interpreter startup.In short, whilst this allows for a malicious users to drop a file that will be executed at startup, it’s an improvement compared to the existing pth
files.
This can be documented and taught as simple as saying that the interpreter will try to look for the __sitecustomize__
directory at startup in its site paths and if it finds any files with .py
extension, it will then execute it one by one.
For system administrators and tools that package the interpreter, we can now recommend placing files in __sitecustomize__
as they used to place sitecustomize.py
. Being more comfortable on that their content won’t be overridden by the next person, as they can provide with specific files to handle the logic they want to customize.
Library developers should be able to specify a new argument on tools like setuptools that will inject those new files. Something like sitecustomize_files=["scripts/betterexceptions.py"]
, which allows them to add those. Should the build backend not support that, they can manually install them as they used to do with pth
files. We will recommend them to include the name of the package as part of the file’s name.
This PEP adds a deprecation warning on sitecustomize.py
, usercustomize.py
and pth
code execution in 3.11, 3.12 and 3.13. With plans on removing those features by 3.14. The migration from those solutions to __sitecustomize__
should ideally be just moving the logic into a different file.
Whilst the existing sitecustomize.py
mechanism was created targeting System Administrators that placed it in a site path, the file could be actually placed anywhere in the path at the time that the interpreter was starting up. The new mechanism does not allow for users to place __sitecustomize__
directories anywhere in the path, but only in site paths. System administrators can recover a similar behavior to sitecustomize.py
by adding a custom file in __sitecustomize__
which just imports sitecustomize
as a migration path.
An initial implementation that passes the CPython test suite is available for evaluation [6].
This implementation is just for the reviewer to play with and check potential issues that this PEP could generate.
Rejected Ideas Do nothingWhilst the current status “works” it presents the issues listed in the motivation. After analyzing the impact of this change, we believe it is worth it, given the enhanced experience it brings.
Formalize usingpth
files
Another option would be to just glorify and document the usage of pth
files to inject code at startup code, but that is a suboptimal experience for users as listed in the motivation.
__sitecustomize__
a namespace package
We considered making the directory a namespace package and just import all the modules within it, which allowed searching across all paths in sys.path
at initialization time and provided a way to declare dependencies between files by importing each other. This was rejected for multiple reasons:
__init__.py
file in one of the locations.__sitecustomize__
as we are looking for pth
files already in the site paths compared to performing an actual import of a namespace package.init.d
users might be tempted to implement this feature in a way that users could also add code at shutdown, but extra support for that is not needed, as Python users can already do that via atexit
.
We considered extending the use of entry points to allow specifying files that should be executed at startup but we discarded that solution due to two main reasons. The first one being impact on startup time. This approach will require scanning all packages distribution information to just execute a handful of files. This has an impact on performance even if the user is not using the feature and such impact growths linearly with the number of packages installed in the environment. The second reason was that the proposed implementation in this PEP offers a single solution for startup customization for packages and system administrators. Additionally, if the main objective of entry points is to make it easy for libraries to install files at startup, that can still be added and make the build backends just install the files within the __sitecustomize__
directory.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
AcknowledgementsThanks Pablo Galindo for contributing to this PEP and offering his PC to run the benchmark.
ReferencesRetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4