Folks,

Here's a first stab at a PEP about controlling generation of bytecode
files. Feedback appreciated.

Skip

----------------------------------------------------------------------------
PEP: NNN
Title: Controlling generation of bytecode files
Version: $Revision: $
Last-Modified: $Date: $
Author: Skip Montanaro
Status: Active
Type: Draft
Content-Type: text/x-rst
Created: 22-Jan-2003
Post-History:


Abstract
========

This PEP outlines a mechanism for controlling the generation and
location of compiled Python bytecode files. This idea originally
arose as a patch request [1]_ and evolved into a discussion thread on
the python-dev mailing list [2]_. The introduction of an environment
variable will allow people installing Python or Python-based
third-party packages to control whether or not bytecode files should
be generated, and if so, where they should be written.


Proposal
========

Add a new environment variable, PYCROOT, to the mix of environment
variables which Python understands. Its interpretation is:

- If not present, or present but set to an empty string, Python
  bytecode is generated in exactly the same way as is currently done.

- If present and it refers to an existing directory, bytecode files
  are written into a directory structure rooted at that location.

- If present and it does not refer to an existing directory,
  generation of bytecode files is suppressed altogether.

sys.path is not modified.

If PYCROOT is set and valid, during module lookup, the bytecode file
will be looked for first in the same directory as the source file,
then in the directory formed by prefixing the source file's directory
with the PYCROOT directory, e.g., in a Unix environment::

    os.path.join(os.environ["PYCROOT"],
                 os.path.dirname(sourcefile).lstrip(os.sep))

(The leading separator is stripped because os.path.join discards
earlier components when a later one is absolute. Under Windows the
above operation, while conceptually similar, will almost certainly
differ in detail.)


Rationale
=========

In many environments it is not possible for non-root users to write
into the directory containing the source file.
Most of the time the only cost is reduced performance, since no
bytecode file can be saved for reuse. In some cases it can be an
annoyance, if nothing else. [3]_

In other situations, where the directory is writable, it can be a
source of file corruption if multiple processes attempt to write the
same bytecode file at the same time. [4]_

In environments with ramdisks available, it may be desirable from a
performance standpoint to write bytecode files to a directory on such
a disk.


Alternatives
============

The only alternative proposed so far [1]_ is to add a -R flag to the
interpreter to disable writing bytecode files altogether. This
proposal subsumes that, since setting PYCROOT to something that is
not an existing directory has the same effect.


Issues
======

- When looking for a bytecode file, should the directory holding the
  source file be considered as well, or just the location implied by
  PYCROOT? If both, which should be searched first? It seems to me
  that if a module lives in /usr/local/lib/python2.3/mod.py and was
  installed by root without PYCROOT set, you'd want to use the
  bytecode file there if it was up-to-date without ever considering
  os.environ["PYCROOT"] + "/usr/local/lib/python2.3/". Only if you
  need to write out a bytecode file would anything turn up there.

- Operation on multi-root file systems (e.g., Windows). On Windows
  each drive is fairly independent. If PYCROOT is set to C:\TEMP and
  a module is located in D:\PYTHON22\mod.py, where should the bytecode
  file be written? I think a scheme similar to what Cygwin uses
  (treat drive letters more-or-less as directory names) would work in
  practice, but I have no direct experience to draw on. The above
  might cause C:\TEMP\D\PYTHON22\mod.pyc to be written.

  What if PYCROOT doesn't include a drive letter? Perhaps the current
  drive at startup should be assumed.

- Interpretation of a module's __file__ attribute. I believe the
  __file__ attribute of a module should reflect the true location of
  the bytecode file. If people want to locate a module's source code,
  they should use imp.find_module(module).

- Security - What if root has PYCROOT set? Yes, this can present a
  security risk, but so can many things the root user does. The root
  user should probably not set PYCROOT except during installation.
  Still, perhaps this problem can be minimized. When running as root
  the interpreter should check to see if PYCROOT refers to a
  world-writable directory. If so, it could raise an exception or
  warning and reset PYCROOT to the empty string. Or, see the next
  item.

- More security - What if PYCROOT refers to a general directory (say,
  /tmp)? In this case, perhaps loading of a preexisting bytecode file
  should occur only if the file is owned by the current user or root.
  (Does this matter on Windows?)

- Runtime control - Should there be a variable in sys (say,
  sys.pycroot) which takes on the value of PYCROOT (or an empty string
  or None) and which is modifiable on-the-fly? Should sys.pycroot be
  initialized from PYCROOT and then PYCROOT ignored (that is, what if
  they differ)?

- Should there be a command-line flag for the interpreter instead of
  or in addition to an environment variable? This seems like it would
  be less flexible. During Python installation, the user frequently
  doesn't have ready access to the interpreter command line. Using an
  environment variable makes it easier to control behavior.

- Should PYCROOT be interpreted differently during installation than
  at runtime? I have no idea. (Maybe it's just a stupid thought, but
  it occurred to me, so I thought I'd mention it.)


Examples
========

All of the examples which follow use the urllib module. Unless
otherwise indicated, it lives in /usr/local/lib/python2.3/urllib.py
and /usr/local/lib/python2.3 is not writable by the current, non-root
user.

- PYCROOT is set to /tmp. /usr/local/lib/python2.3/urllib.pyc exists,
  but is out-of-date. When urllib is imported, the generated bytecode
  file is written to /tmp/usr/local/lib/python2.3/urllib.pyc.
  Intermediate directories will be created as needed.

- PYCROOT is not set. No urllib.pyc file is found. When urllib is
  imported, no bytecode file is written.

- PYCROOT is set to /tmp. No urllib.pyc file is found. When urllib
  is imported, the generated bytecode file is written to
  /tmp/usr/local/lib/python2.3/urllib.pyc, again creating
  intermediate directories as needed.


References
==========

.. [1] patch 602345, Option for not writing .py[co] files, Klose
   (http://www.python.org/sf/602345)

.. [2] python-dev thread, Disable writing .py[co], Norwitz
   (http://mail.python.org/pipermail/python-dev/2003-January/032270.html)

.. [3] Debian bug report, Mailman is writing to /usr in cron, Wegner
   (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=96111)

.. [4] python-dev thread, Parallel pyc construction, Dubois
   (http://mail.python.org/pipermail/python-dev/2003-January/032060.html)


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
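
The three-way interpretation of PYCROOT given in the Proposal, together
with the path mapping used in the Examples, can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
part of the proposal: the function name ``bytecode_dir`` is
hypothetical, only the Unix case is handled, and the lookup order and
the security checks raised under Issues are left out.

```python
import os


def bytecode_dir(sourcefile, pycroot):
    """Return the directory where bytecode for sourcefile would be
    written, or None if generation is suppressed.

    Hypothetical helper; pycroot stands for the value of the PYCROOT
    environment variable (None when it is unset).
    """
    source_dir = os.path.dirname(os.path.abspath(sourcefile))

    if not pycroot:
        # Unset or empty: behave as today, next to the source file.
        return source_dir

    if not os.path.isdir(pycroot):
        # Set, but not an existing directory: suppress generation.
        return None

    # Strip the leading separator first; os.path.join would otherwise
    # discard pycroot because source_dir is an absolute path.
    return os.path.join(pycroot, source_dir.lstrip(os.sep))
```

A caller would pass ``os.environ.get("PYCROOT")`` as the second
argument. With PYCROOT set to /tmp and the urllib module from the
Examples, the sketch yields the directory /tmp/usr/local/lib/python2.3,
which would still have to be created before the bytecode file is
written.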