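Here is a script that replaces every .rst file in the subdirectories of Doc/ other than Doc/library/typing.rst with simply "foo", creates a fresh venv and installs the documentation requirements into it, times a Sphinx HTML build of library/typing.rst, and finally restores the working tree with git restore: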

    import contextlib
    import shutil
    import subprocess
    import time
    import venv
    from pathlib import Path


    def run(args):
        # Run a command quietly; dump its output only if it fails.
        try:
            subprocess.run(args, check=True, capture_output=True, text=True)
        except subprocess.CalledProcessError as e:
            print(e.stdout)
            print(e.stderr)
            raise


    with contextlib.chdir("Doc"):
        try:
            # Replace every .rst file except library/typing.rst with "foo".
            for path in Path(".").iterdir():
                if path.is_dir() and not str(path).startswith("."):
                    for doc_path in path.rglob("*.rst"):
                        if doc_path != Path("library/typing.rst"):
                            doc_path.write_text("foo")

            venv.create(".venv", with_pip=True)
            run([
                ".venv/bin/python",
                "-m",
                "pip",
                "install",
                "-r",
                "requirements.txt",
                "--no-binary=':all:'",
            ])

            # Time only the Sphinx build itself.
            start = time.perf_counter()
            run([
                ".venv/bin/python",
                "-m",
                "sphinx",
                "-b",
                "html",
                ".",
                "build/html",
                "library/typing.rst",
            ])
            print(time.perf_counter() - start)

            shutil.rmtree(".venv")
            shutil.rmtree("build")
        finally:
            # Undo the changes made to the .rst files.
            subprocess.run(["git", "restore", "."], check=True, capture_output=True)

Using a PGO-optimized build with LTO enabled, the script reports a significant performance regression in Sphinx's parsing and building of library/typing.rst between v3.13.0a1 and 909c6f7:

On v3.13.0a1, the script reports a Sphinx build time of between 1.27s and 1.29s (I ran the script several times).

A similar regression is reported in this (much slower) variation of the script, which builds the entire set of CPython's documentation rather than just library/typing.rst.

    import contextlib
    import shutil
    import subprocess
    import time
    import venv


    def run(args):
        subprocess.run(args, check=True, text=True)


    with contextlib.chdir("Doc"):
        venv.create(".venv", with_pip=True)
        run([
            ".venv/bin/python",
            "-m",
            "pip",
            "install",
            "-r",
            "requirements.txt",
            "--no-binary=':all:'",
        ])

        # Time a full HTML build of the documentation.
        start = time.perf_counter()
        run([
            ".venv/bin/python",
            "-m",
            "sphinx",
            "-b",
            "html",
            ".",
            "build/html",
        ])
        print(time.perf_counter() - start)

        shutil.rmtree(".venv")
        shutil.rmtree("build")

The PGO-optimized timings for building the entire CPython documentation are as follows:

v3.13.0a1: 45.5s

This indicates a 38% performance regression for building the entire set of CPython's documentation.
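
For readers who want to check the arithmetic behind a figure like this, here is a minimal sketch. The helper names `percent_regression` and `implied_time` are made up for illustration, and the second timing it prints is only what the 45.5s baseline and the 38% figure imply together (roughly 62.8s), not a measurement quoted from the benchmark.

```python
def percent_regression(baseline_s: float, new_s: float) -> float:
    """Return how much slower new_s is than baseline_s, as a percentage."""
    return (new_s / baseline_s - 1.0) * 100.0


def implied_time(baseline_s: float, regression_pct: float) -> float:
    """Return the time implied by a baseline and a percentage slowdown."""
    return baseline_s * (1.0 + regression_pct / 100.0)


baseline = 45.5  # v3.13.0a1 full-docs build time reported above, in seconds
slowdown = 38.0  # percentage regression reported above

# Derived value, not a measured one: what the two reported numbers imply.
implied = implied_time(baseline, slowdown)
print(f"implied build time after the regression: {implied:.1f}s")
print(f"round trip: {percent_regression(baseline, implied):.0f}% slower")
```

The same `percent_regression` helper can be applied to any pair of timings produced by the scripts above.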
Cause of the performance regression

This performance regression was initially discovered in #118891: in our own CI, we use a fresh build of CPython in our Doctest CI workflow (since otherwise we wouldn't be testing the tip of the main branch), and it was observed that the CI job was taking significantly longer on the 3.13 branch than on the 3.12 branch. In the context of our CI, the performance regression is even worse, because our Doctest CI workflow uses a debug build rather than a PGO-optimized build, and the regression is more pronounced in a debug build.
Using a debug build, I used the first script posted above to bisect the performance regression to commit 1530932 (below), which seemed to cause a performance regression of around 300% in a debug build:
15309329b65a285cb7b3071f0f08ac964b61411b is the first bad commit
commit 15309329b65a285cb7b3071f0f08ac964b61411b
Author: Mark Shannon <mark@hotpy.org>
Date: Wed Mar 20 08:54:42 2024 +0000
GH-108362: Incremental Cycle GC (GH-116206)
Doc/whatsnew/3.13.rst | 30 +
Include/internal/pycore_gc.h | 41 +-
Include/internal/pycore_object.h | 18 +-
Include/internal/pycore_runtime_init.h | 8 +-
Lib/test/test_gc.py | 72 +-
.../2024-01-07-04-22-51.gh-issue-108362.oB9Gcf.rst | 12 +
Modules/gcmodule.c | 25 +-
Objects/object.c | 21 +
Objects/structseq.c | 5 +-
Python/gc.c | 806 +++++++++++++--------
Python/gc_free_threading.c | 23 +-
Python/import.c | 2 +-
Python/optimizer.c | 2 +-
Tools/gdb/libpython.py | 7 +-
14 files changed, 684 insertions(+), 388 deletions(-)
create mode 100644 Misc/NEWS.d/next/Core and Builtins/2024-01-07-04-22-51.gh-issue-108362.oB9Gcf.rst
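
A timing-based bisection like this one can be automated with `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad. The following is a minimal sketch of such a helper, not the procedure actually used for this issue: the threshold value and the function names are hypothetical, and rebuilding CPython at each commit is left to whatever command invokes the helper.

```python
import subprocess
import sys
import time
from typing import Optional

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`
THRESHOLD_S = 5.0  # hypothetical cutoff between known-good and known-bad timings


def time_command(cmd: list) -> Optional[float]:
    """Run cmd and return elapsed wall-clock seconds, or None if it failed."""
    start = time.perf_counter()
    result = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return elapsed if result.returncode == 0 else None


def bisect_exit_code(elapsed: Optional[float], threshold: float = THRESHOLD_S) -> int:
    """Map a timing to a `git bisect run` exit code."""
    if elapsed is None:
        return SKIP  # the benchmark could not run at this commit
    return GOOD if elapsed < threshold else BAD


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is the benchmark command to time.
    sys.exit(bisect_exit_code(time_command(sys.argv[1:])))
```

Saved under a hypothetical name such as bisect_helper.py, this would be passed to `git bisect run` together with a command that first rebuilds the interpreter. Choosing the threshold midway between the timings of a known-good and a known-bad commit keeps run-to-run noise from flipping the verdict.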
Performance was then significantly improved by commit e28477f (below), but it's unfortunately still the case that Sphinx is far slower on Python 3.13 than on Python 3.12:
commit e28477f214276db941e715eebc8cdfb96c1207d9
Author: Mark Shannon <mark@hotpy.org>
Date: Fri Mar 22 18:43:25 2024 +0000
GH-117108: Change the size of the GC increment to about 1% of the total heap size. (GH-117120)
Include/internal/pycore_gc.h | 3 +-
Lib/test/test_gc.py | 35 +++++++++++++++-------
.../2024-03-21-12-10-11.gh-issue-117108._6jIrB.rst | 3 ++
Modules/gcmodule.c | 2 +-
Python/gc.c | 30 +++++++++----------
Python/gc_free_threading.c | 2 +-
6 files changed, 47 insertions(+), 28 deletions(-)
create mode 100644 Misc/NEWS.d/next/Core and Builtins/2024-03-21-12-10-11.gh-issue-117108._6jIrB.rst
See #118891 (comment) for more details on the bisection results.
Profiling by @nascheme in #118891 (comment) and #118891 (comment) also confirms that Sphinx spends a significant amount of time in the GC, so it seems very likely that the changes to introduce an incremental GC in Python 3.13 are the cause of this performance regression.
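
One rough way to see how much wall-clock time a workload spends in the cyclic GC, without an external profiler, is to hook `gc.callbacks`, which the interpreter calls with a "start" or "stop" phase around each collection. The sketch below is an illustration only and does not reproduce the profiling linked above: `GCTimer` is a made-up helper, and `make_cycles` is a hypothetical stand-in workload rather than anything Sphinx does.

```python
import gc
import time


class GCTimer:
    """Accumulate wall-clock time spent in cyclic GC passes via gc.callbacks."""

    def __init__(self) -> None:
        self.total = 0.0
        self.collections = 0
        self._start = None

    def __call__(self, phase: str, info: dict) -> None:
        # The interpreter calls this with phase "start" before a collection
        # and "stop" after it.
        if phase == "start":
            self._start = time.perf_counter()
        elif phase == "stop" and self._start is not None:
            self.total += time.perf_counter() - self._start
            self.collections += 1
            self._start = None

    def __enter__(self) -> "GCTimer":
        gc.callbacks.append(self)
        return self

    def __exit__(self, *exc) -> None:
        gc.callbacks.remove(self)


def make_cycles(n: int) -> None:
    """Hypothetical stand-in workload: allocate n short-lived reference cycles."""
    for _ in range(n):
        a, b = [], []
        a.append(b)
        b.append(a)


with GCTimer() as timer:
    start = time.perf_counter()
    make_cycles(200_000)
    gc.collect()  # force at least one collection so the hook fires
    elapsed = time.perf_counter() - start

print(f"{timer.collections} collections, {timer.total:.4f}s of {elapsed:.4f}s in GC")
```

Running the same workload under two interpreter builds and comparing the reported GC share gives a quick, if coarse, indication of whether collector changes account for a slowdown.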
Cc. @markshannon for expertise on the new incremental GC, and cc. @hugovk / @AA-Turner for Sphinx expertise.
CPython versions tested on: 3.12, 3.13, CPython main branch
Operating systems tested on: macOS