This PEP proposes extending the existing mechanism for setting up sys.path to include a new __pypackages__ directory, in addition to the existing locations. The new directory will be added at the start of sys.path, after the current working directory and just before the system site-packages, to give packages installed there priority over other locations.
This is similar to the existing mechanism of adding the current directory (or the directory the script is located in), but by using a subdirectory, additional libraries are kept separate from the user’s work.
Motivation
New Python programmers can benefit from being taught the value of isolating an individual project’s dependencies from their system environment. However, the existing mechanism for doing this, virtual environments, is known to be complex and error-prone for beginners to understand. Explaining virtual environments is often a distraction when trying to get a group of beginners set up - differences in platform and shell environments require individual assistance, and the need for activation in every new shell session makes it easy for students to make mistakes when coming back to work after a break. This proposal offers a lightweight solution that gives isolation without the user needing to understand more advanced concepts.
Furthermore, standalone Python applications usually need 3rd party libraries to function. Typically, they are either designed to be run from a virtual environment, where the dependencies are installed into the environment alongside the application, or they bundle their dependencies in a subdirectory, and modify sys.path at application startup. Virtual environments, while a common and effective solution (used, for example, by the pipx tool), are somewhat awkward to set up and manage, and are not relocatable. On the other hand, manual manipulation of sys.path is boilerplate that developers need to get right, and (being a runtime behaviour) it is not understood by tools like linters and type checkers. The __pypackages__ proposal formalises the idea of a “bundled dependencies” location, avoiding the boilerplate and providing a standard location that development tools can be taught to recognise.
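The manual sys.path boilerplate mentioned above typically looks something like the following sketch. The vendor directory name and the add_bundled_dependencies helper are hypothetical, chosen only for illustration; real applications differ in the details.

```python
import sys
from pathlib import Path


def add_bundled_dependencies(script_file, subdir="vendor"):
    """Prepend a bundled-dependencies directory next to the script to sys.path.

    ``subdir`` is a hypothetical directory name; applications pick their own.
    """
    bundled = Path(script_file).resolve().parent / subdir
    entry = str(bundled)
    # Only add the directory if it exists and is not already on the path.
    if bundled.is_dir() and entry not in sys.path:
        sys.path.insert(0, entry)
    return bundled


# Typical use at the very top of an application's entry-point script:
# add_bundled_dependencies(__file__)
```

Because this runs only when the script executes, a linter or type checker reading the source has no way to know that the extra directory will be importable, which is the gap a standard location is meant to close.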
It should be noted that in general, Python libraries cannot be simply copied between machines, platforms, or even necessarily between Python versions. This proposal does nothing to change that fact, and while it is tempting to assume that bundling a script and its __pypackages__ is a mechanism for distributing applications, this is explicitly not a goal of this proposal. Developers remain responsible for the portability of their code.
While sys.path can be manipulated at runtime, the default value is important, as it establishes a common baseline that users and tools can agree on. The current default does not include a location that could be viewed as “private to the current project”, and yet that is a useful concept.
This is similar to the npm node_modules directory, which is popular in the JavaScript community, and something that developers familiar with that ecosystem often ask for from Python.
This PEP proposes to add a new step in the process of calculating sys.path at startup.
When the interactive interpreter starts, if a __pypackages__ directory is found in the current working directory, then it will be included in sys.path after the entry for the current working directory and just before the system site-packages.
When the interpreter runs a script, Python will try to find __pypackages__ in the same directory as the script. If found (along with the current Python version directory inside), then it will be used; otherwise Python will behave as it does currently.
The behaviour should match exactly that of the existing mechanism for adding the current working directory or script directory to sys.path. For example, __pypackages__ will be ignored if the -P option or the PYTHONSAFEPATH environment variable is set.
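The lookup rule described above can be sketched as a small function. This is only an illustration of the rule as stated in this PEP, not the interpreter's startup code; the name pypackages_search_dir and its parameters are hypothetical, and symlink handling is ignored for simplicity.

```python
import os


def pypackages_search_dir(script_path=None, cwd=None, safe_path=False):
    """Return the directory in which __pypackages__ would be looked for.

    Returns None when the lookup is disabled, as with -P or PYTHONSAFEPATH.
    """
    if safe_path:
        return None
    if script_path is not None:
        # Running a script: only the script's own directory is considered,
        # never the current working directory.
        return os.path.dirname(os.path.abspath(script_path))
    # Interactive interpreter (or -m): the current working directory.
    return os.path.abspath(cwd if cwd is not None else os.getcwd())
```

Note that only a single directory is ever returned: nothing in the rule involves walking up to parent directories.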
In order to be recognised, the __pypackages__ directory must be laid out according to a new localpackages scheme in the sysconfig module. Specifically, both of the purelib and platlib directories must be present, using the following code to determine the locations of those directories:
    import sysconfig

    scheme = "localpackages"
    purelib = sysconfig.get_path("purelib", scheme, vars={"base": "__pypackages__", "platbase": "__pypackages__"})
    platlib = sysconfig.get_path("platlib", scheme, vars={"base": "__pypackages__", "platbase": "__pypackages__"})
These two locations will be added to sys.path; other directories or files in the __pypackages__ directory will be silently ignored. The paths will depend on the Python version.
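As a rough sketch of the step above, the following shows how the library directory might be located and spliced into a path list. The localpackages scheme is only proposed here and is not available in sysconfig, so the code assumes a POSIX-style lib/pythonX.Y/site-packages layout (as in the example in this document) in which purelib and platlib coincide. The function names are hypothetical, and placing the new entries directly after the first path entry is also an assumption taken from that example.

```python
import os
import sys


def pypackages_entries(search_dir, version_info=sys.version_info):
    """Return existing library directories under search_dir/__pypackages__.

    Assumes a POSIX-style layout as a stand-in for the proposed scheme.
    """
    version = f"python{version_info[0]}.{version_info[1]}"
    site_dir = os.path.join(
        search_dir, "__pypackages__", "lib", version, "site-packages"
    )
    # purelib and platlib coincide in this assumed layout, so there is at
    # most one entry; a missing directory means nothing is added.
    return [site_dir] if os.path.isdir(site_dir) else []


def extend_path(path, search_dir, version_info=sys.version_info):
    """Return a copy of path with the entries inserted right after path[0]."""
    entries = [
        e for e in pypackages_entries(search_dir, version_info) if e not in path
    ]
    return path[:1] + entries + path[1:]
```

Anything else inside __pypackages__ is never inspected by this sketch, which mirrors the "silently ignored" behaviour described above.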
Note
A separate new API is one possible option; it is documented at issue #3013.
Example
The following shows an example project directory structure, and different ways the Python executable and any script will behave. The example is for Unix-like systems - on Windows the subdirectories will be different.
    foo
        __pypackages__
            lib
                python3.10
                    site-packages
                        bottle
        myscript.py

    /> python foo/myscript.py
    sys.path[0] == 'foo'
    sys.path[1] == 'foo/__pypackages__/lib/python3.10/site-packages/'

    cd foo

    foo> /usr/bin/ansible
        #! /usr/bin/env python3
    foo> python /usr/bin/ansible

    foo> python myscript.py

    foo> python
    sys.path[0] == '.'
    sys.path[1] == './__pypackages__/lib/python3.10/site-packages'

    foo> python -m bottle
We have a project directory called foo and it has a __pypackages__ inside of it. We have bottle installed in that __pypackages__/lib/python3.10/site-packages/, and have a myscript.py file inside of the project directory. We have used whatever tool we generally use to install bottle in that location.
When invoking a script, Python will try to find a __pypackages__ inside of the directory in which the script resides [1], /usr/bin. The same will happen in the case of the last example, where we are executing /usr/bin/ansible from inside of the foo directory. In both cases, it will not use the __pypackages__ in the current working directory.
Similarly, if we invoke myscript.py from the first example, it will use the __pypackages__ directory that was in the foo directory.
If we go inside of the foo directory and start the Python executable (the interpreter), it will find the __pypackages__ directory inside of the current working directory and use it in sys.path. The same happens if we use the -m option to run a module. In our example, the bottle module will be found inside of the __pypackages__ directory.
The above two examples are the only cases where the __pypackages__ from the current working directory is used.
In another example scenario, a trainer of a Python class can say “Today we are going to learn how to use Twisted! To start, please check out our example project, go to that directory, and then run a given command to install Twisted.”
That will install Twisted into a directory separate from python3. There’s no need to discuss virtual environments, global versus user installs, etc., as the install will be local by default. The trainer can then just keep telling them to use python3 without any activation step, etc.
At its heart, this proposal is simply to modify the calculation of the default value of sys.path, and does not relate at all to the virtual environment mechanism. However, __pypackages__ can be viewed as providing an isolation capability, and in that sense, it “competes” with virtual environments.
However, there are significant differences:
- Virtual environments are isolated from the system environment, whereas __pypackages__ simply adds to the system environment.
- Virtual environments include a full “installation scheme”, with directories for binaries, C header files, etc., whereas __pypackages__ is solely for Python library code.
- Virtual environments work most smoothly when “activated”. This proposal needs no activation.
This proposal should be seen as independent of virtual environments, not competing with them. At best, some use cases currently only served by virtual environments can also be served (possibly better) by __pypackages__.
It should be noted that libraries installed in __pypackages__ will be visible in a virtual environment. This arguably breaks the isolation of virtual environments, but it is no different in principle to the presence of the current directory on sys.path (or mechanisms like the PYTHONPATH environment variable). The only difference is in degree, as the expectation is that people will more commonly install packages in __pypackages__. The alternative would be to explicitly detect virtual environments and disable __pypackages__ in that case - however that would break scripts with bundled dependencies. The PEP authors believe that developers using virtual environments should be experienced enough to understand the issue and anticipate and avoid any problems.
In theory, it is possible to add a library to the __pypackages__ directory that overrides a stdlib module or an installed 3rd party library. For the __pypackages__ associated with a script, this is assumed not to be a significant issue, as it is unlikely that anyone would be able to write to __pypackages__ unless they also had the ability to write to the script itself.
For a __pypackages__ directory in the current working directory, the interactive interpreter could be affected. However, this is not significantly different than the existing issue of someone having a math.py module in their current directory, and while (just like that case) it can cause user confusion, it does not introduce any new security implications.
When running a script, any __pypackages__ directory in the current working directory is ignored. This is the same approach Python uses for adding the current working directory to sys.path and ensures that it is not possible to change the behaviour of a script by modifying files in the current directory.
Also, a __pypackages__ directory is only recognised in the current (or script) directory. The interpreter will not scan for __pypackages__ in parent directories. Doing so would open up the risk of security issues if directory permissions on parents differ. In particular, scripts in the bin directory of __pypackages__ (the scripts location in sysconfig terms) have no special access to the libraries installed in __pypackages__. Putting executable scripts in a bin directory is not supported by this proposal.
The original motivation for this proposal was to make it easier to teach Python to beginners. To that end, it needs to be easy to explain, and simple to use.
At the most basic level, this is similar to the existing mechanism where the script directory is added to sys.path and can be taught in a similar manner. However, for its intended use of “lightweight isolation”, it would likely be taught in terms of “things you put in a __pypackages__ directory are private to your script”. The experience of the PEP authors suggests that this would be significantly easier to teach than the current alternative of introducing virtual environments.
As the intended use of the feature is to install 3rd party libraries in the new directory, it is important that tools, particularly installers, understand how to manage __pypackages__.
It is hoped that tools will introduce a dedicated “pypackages” installation mode that is guaranteed to match the expected layout in all cases. However, the question of how best to support the __pypackages__ layout is ultimately left to individual tool maintainers to consider and decide on.
Tools that locate packages without actually running Python code (IDEs, linters, type checkers, etc.) would need updating to recognise __pypackages__. In the absence of such updates, the __pypackages__ directory would work similarly to directories currently added to sys.path at runtime (i.e., the tool would probably ignore it).
The directory name __pypackages__ was chosen because it is unlikely to be in common use. It is true that users who have chosen to use that name for their own purposes will be impacted, but at the time this PEP was written, this was viewed as a relatively low risk.
Unfortunately, in the time this PEP has been under discussion, a number of tools have chosen to implement variations on what is being proposed here, which are not all compatible with the final form of the PEP. As a result, the risk of clashes is now higher than originally anticipated.
It would be possible to mitigate this by choosing a different name, hopefully as uncommon as __pypackages__ originally was. But realistically, any compatibility issues can be viewed as simply the consequences of people trying to implement draft proposals, without making the effort to track changes in the proposal. As such, it seems reasonable to retain the __pypackages__ name, and put the burden of addressing the compatibility issue on the tools that implemented the draft version.
Other Python implementations will need to replicate the new behavior of the interpreter bootstrap, including locating the __pypackages__ directory and adding it to sys.path just before site packages, if it is present. This is no different to any other Python change.
Here is a small script which will enable the implementation for CPython and PyPy.
- Alternative names, such as __pylocal__ and python_modules. Ultimately, the name is arbitrary and the chosen name is good enough.
- Scanning parent directories for __pypackages__. If we want to execute scripts inside of the ~/bin/ directory, then the __pypackages__ directory must be inside of the ~/bin/ directory. Doing any such scan for __pypackages__ (for the interpreter or a script) will have security implications and also increase startup time.
- Rejecting any unexpected files or directories in __pypackages__. This is considered too strict, particularly as transitional approaches like pip install --prefix can create additional files in __pypackages__.
- [...] sysconfig scheme, or a dedicated pypackages scheme. While this is attractive in theory, it makes transition harder, as there will be no readily-available way of installing to __pypackages__ until tools implement explicit support. And while the PEP authors hope and assume that such support would be added, having the proposal dependent on such support in order to be usable seems like an unacceptable risk.
This document has been placed in the public domain.