Updated summary by @ezyang. Previously, this ticket talked about all sorts of parallelism at many levels. Component-level parallelism was already addressed in #2623 (fixed by per-component builds), so all that remains is per-module parallelism. This is substantially more difficult, because right now we build by invoking `ghc --make`; achieving module parallelism would require teaching Cabal how to build using `ghc -c`. But this too has a hazard: if you don't have enough cores, or have a serial dependency graph, `ghc -c` will be slower, because GHC spends more time reloading interface files. In #976 (comment) @dcoutts describes how to overcome this problem.
There are several phases to the problem:
First, build the GHC build server and parallelism infrastructure. This can be done completely independently of Cabal: imagine a program whose command line is identical to GHC's, but which is internally implemented by spinning up multiple GHC processes and farming out the compilation work. You can tell the effort was worthwhile when you get better scaling than GHC's built-in `-j` and a traditional `-c` setup.
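To make the "drop-in driver" idea concrete, here is a minimal Python sketch (all names invented, and the compile step stubbed out): a wrapper that accepts a GHC-style argument list, peels off the Haskell source files, forwards the remaining flags, and farms each module out to a bounded worker pool. Note that it deliberately ignores module interdependencies; handling those correctly is exactly what makes the real driver hard.

```python
# Hypothetical sketch of an argv-compatible "parallel ghc" driver.
# run_one stands in for spawning "ghc -c <flags> <source>" as a subprocess.
from concurrent.futures import ThreadPoolExecutor

def drive(argv, run_one, jobs=4):
    """Split a GHC-style command line into flags and sources, then
    compile each source concurrently (at most `jobs` at a time)."""
    flags = [a for a in argv if not a.endswith(".hs")]
    sources = [a for a in argv if a.endswith(".hs")]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # pool.map preserves the order of `sources` in its results
        return list(pool.map(lambda src: run_one(flags, src), sources))
```

A real driver would also have to reproduce GHC's exit codes and diagnostics so that Cabal cannot tell the difference.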
Next, we need to teach Cabal/cabal-install how to take advantage of this functionality. If you implement your driver program with exactly the same command-line flags as GHC, this is as simple as passing `-w $your_parallel_ghc_impl`. However, there is a problem with doing it this way: cabal-install will attempt to spin up N parallel package/component builds, each of which will in turn try to spin up M GHC build servers. This is bad; you want the total number of GHC build servers to equal the number of cores. So you will need to set up some sort of signalling mechanism to keep too many build servers from running at once, OR have `cabal new-build` orchestrate the entire build down to the module level so it can plan parallelism itself (but you would probably have to rearchitect according to "Rewrite Cabal in Shake" #4174 before you can do this).
Now that package-level parallel install has been implemented (see #440), the next logical step is to extend `cabal build` with support for building multiple modules, components and/or build variants (static/shared/profiling) in parallel. This functionality should also be integrated with `cabal install` in such a way that we don't over- or underutilise the available cores.
A prototype implementation of a parallel `cabal build` is already available as a standalone tool. It works by first extracting a module dependency graph with `ghc -M` and then running multiple `ghc -c` processes in parallel.
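The scheduling core of that workflow can be sketched as follows (a simplified model, not the prototype's actual code): given a dependency map of the kind `ghc -M` produces, repeatedly submit every module whose dependencies are already built, up to a fixed job limit. The compile step is a callback; a real driver would invoke `ghc -c` there.

```python
# Sketch: dependency-respecting parallel compilation over a "ghc -M"-style
# graph. Assumes the graph is acyclic (GHC module imports are).
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def parallel_build(deps, compile_one, jobs=4):
    """deps maps module -> set of in-project modules it imports.
    compile_one(module) performs the actual per-module build."""
    remaining = {m: set(d) for m, d in deps.items()}
    done = set()
    futures = {}
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        while remaining or futures:
            # schedule every module whose dependencies are all built
            ready = [m for m in remaining if remaining[m] <= done]
            for m in ready:
                del remaining[m]
                futures[pool.submit(compile_one, m)] = m
            finished, _ = wait(list(futures), return_when=FIRST_COMPLETED)
            for f in finished:
                f.result()  # propagate compile errors
                done.add(futures.pop(f))
    return done
```

With `{"Main": {"A", "B"}, "A": {"B"}, "B": set()}`, `B` is compiled first, then `A`, then `Main`, while independent modules run concurrently.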
Since the parallel install code uses the external setup method exclusively, integrating parallel `cabal build` with parallel install will require using IPC. A single coordinating `cabal install -j N` process will spawn a number of `setup.exe build --semaphore=/path/to/semaphore` children, and each child will build at most N modules simultaneously. An added benefit of this approach is that nothing special has to be done to support custom setup scripts.
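The coordination pattern is a classic shared counting semaphore. The following is an in-process model of it (an illustration, not the actual Cabal protocol: the real design uses separate `setup.exe` processes sharing a named OS semaphore, while this sketch uses threads purely to stay self-contained): the coordinator creates N tokens, each per-package child acquires one token per in-flight module build, so total concurrency is capped at N no matter how many children run.

```python
# Sketch: cap total module-build concurrency across several package
# builders with one shared semaphore (stand-in for /path/to/semaphore).
import threading

def build_package(pkg, modules, sem, log):
    for m in modules:
        with sem:                      # one token per in-flight module build
            log.append(f"{pkg}:{m}")   # stand-in for spawning a ghc job

def parallel_install(packages, jobs):
    sem = threading.BoundedSemaphore(jobs)
    log = []
    children = [threading.Thread(target=build_package, args=(p, ms, sem, log))
                for p, ms in packages.items()]
    for c in children:
        c.start()
    for c in children:
        c.join()
    return log
```

This is essentially the same trick as GNU make's jobserver: children do not need to know about each other, only about the shared token pool.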
An important issue is that compiling with `ghc -c` is slow compared to `ghc --make` because the interface files are not cached. One way to fix this is to implement a "build server" mode for GHC. Instead of repeatedly running `ghc -c`, each build process will spawn at most N persistent ghcs and distribute the modules between them. Evan Laforge has done some work in this direction.
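The payoff of the build-server mode is that per-worker state survives between jobs. The sketch below models that (names and data shapes are invented; a real GHC server would keep loaded `.hi` interface files in memory): persistent workers pull modules off a shared queue, and each worker's interface cache lives for the worker's whole lifetime rather than for a single `ghc -c` invocation.

```python
# Sketch: N persistent workers share a job queue; each keeps its own
# interface cache alive across jobs, amortising interface loading.
import queue
import threading

def build_server(modules, servers=2):
    jobs = queue.Queue()
    for m in modules:
        jobs.put(m)
    results, lock = [], threading.Lock()

    def worker():
        iface_cache = {}            # persists for this worker's lifetime
        while True:
            try:
                mod = jobs.get_nowait()
            except queue.Empty:
                return
            # a real server would reuse loaded .hi files; we count reuse
            hits = sum(d in iface_cache for d in mod["deps"])
            iface_cache[mod["name"]] = True
            with lock:
                results.append((mod["name"], hits))

    workers = [threading.Thread(target=worker) for _ in range(servers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results
```

With a single one-shot process per module, every `hits` count would be zero; the cache hits are exactly the work the server mode avoids redoing.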
Other issues:
- `cabal repl` (patches).