Is your CI/CD pipeline slow? Do wait times make you feel unproductive? Parallel testing is an indispensable technique for reducing wait times. And mastering it is key to getting the most out of CI/CD.
Learn how to save time, cut costs, and shorten your feedback loop.
What is Parallel Testing?
As evident as it may be, it bears repeating that testing must be automated. We can’t start a discussion about parallel testing without clarifying that it only makes sense when it is automated.
Our starting definition may look deceptively simple: we say we’re performing parallel testing when two or more tests are run simultaneously. Another way of demonstrating this is by showing a depiction of non-parallel testing. Take a look at this continuous integration pipeline.
[Figure: Series tests]
Here, tests are sequential: each step can only begin after the previous one is done. Even if every test takes a few seconds, the whole process can still amount to minutes. Waiting for a build is much more than an annoyance; it’s distracting and energy-draining.
Once we start using parallel testing, we’ll begin seeing something more along these lines:
[Figure: Parallel tests]
By putting independent tests in parallel, we get rapid feedback and save precious minutes each time we make a change. Not only do we not lose focus on the problem we’re working on, we end up reclaiming many productive hours every week.
This is the power of parallel testing. Like upgrading your internet data plan or adding more lanes on a highway, it increases bandwidth, letting you do more work and get farther faster.
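A rough way to see the difference is to compare the two totals: a sequential pipeline takes the sum of its job durations, while a fully parallel one takes only as long as its slowest job. The sketch below uses made-up job names and durations, and assumes there is a separate machine available for every job.

```python
# Hypothetical job durations in seconds; names and numbers are illustrative only.
job_durations = {"lint": 20, "unit": 90, "integration": 150, "e2e": 240}


def sequential_time(durations):
    """Total run time when each job waits for the previous one to finish."""
    return sum(durations.values())


def parallel_time(durations):
    """Total run time when every job starts at once, one machine per job."""
    return max(durations.values())


print(sequential_time(job_durations))  # 500
print(parallel_time(job_durations))    # 240
```

In practice the parallel figure is a best case, since jobs also need time to boot and fetch dependencies.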
Two forms of increasing concurrency
Parallelization requires additional CPU power: we need extra machines to run the concurrent jobs. There are two ways of achieving this:
The litmus test to determine the effectiveness of your pipeline is measuring its total run time. CI/CD is all about feedback loops — the sooner we have a result, the sooner we can fix, refactor, and iterate. When continuous integration takes more than 10 minutes or when continuous delivery has us waiting for more than 20 minutes, it’s high time we started looking into optimization.
Large testing suites are the low-hanging fruit and should be the starting point for optimization. We begin by identifying the longest-running test and checking if it’s possible to break it up into smaller, independent jobs. Then, we repeat the process until the pipeline is fast enough for our needs.
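The "find the longest job, split it, repeat" loop can be sketched in a few lines. This is only an illustration: it optimistically assumes that any job splits into two equal, independent halves, which real test suites rarely allow so cleanly, and the function name, durations, and `max_jobs` cap are all hypothetical.

```python
def split_until_fast(durations, target, max_jobs=32):
    """Repeatedly halve the longest job until the slowest one fits the target.

    Assumes (optimistically) that a job splits into two equal, independent halves.
    """
    jobs = sorted(durations, reverse=True)
    while jobs[0] > target and len(jobs) < max_jobs:
        longest = jobs.pop(0)
        jobs += [longest / 2, longest / 2]
        jobs.sort(reverse=True)
    return jobs


# Hypothetical durations in seconds, with a 200-second target.
jobs = split_until_fast([600, 120, 60], target=200)
print(jobs)       # [150.0, 150.0, 150.0, 150.0, 120, 60]
print(max(jobs))  # 150.0
```

The `max_jobs` cap is there because each split costs another machine, so at some point a faster pipeline stops being worth the extra jobs.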
Tests that cannot be broken apart can sometimes be re-arranged so they don’t stand in the way of the other jobs.
What makes a test a good candidate for parallelization? Check its inputs and outputs. Do the tests generate something other processes need? What does it take to run these tests? The fewer dependencies a test has, the more likely it will work well in parallel.
What follows are a few typically good use cases for parallel testing.
Monorepos
Monorepos are code repositories containing many separate projects. As long as these projects are independent or loosely coupled, monorepos are a perfect fit for parallel testing.
When separation is not possible because the projects are tightly interrelated, parallelization becomes difficult. We can mitigate this problem with dedicated monorepo tools such as Lerna or Yarn.
Semaphore features first-class support for monorepos and can be configured to run tests only on the code that changes.
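The idea of testing only the code that changed can be illustrated with a small path-matching function. This is a conceptual sketch, not how Semaphore implements it: the function name, project directories, and file paths below are all hypothetical.

```python
from pathlib import PurePosixPath


def affected_projects(changed_files, project_dirs):
    """Return the projects that contain at least one changed file."""
    affected = set()
    for changed in changed_files:
        path = PurePosixPath(changed)
        for project in project_dirs:
            # A file belongs to a project if the project directory is one of its ancestors.
            if PurePosixPath(project) in path.parents:
                affected.add(project)
    return sorted(affected)


# Hypothetical monorepo layout and change set.
changed = ["services/billing/app.js", "services/billing/test/app.test.js", "README.md"]
projects = ["services/billing", "services/users", "services/ui"]
print(affected_projects(changed, projects))  # ['services/billing']
```

Only the billing tests would need to run for this change; the other projects can be skipped.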
https://github.com/semaphoreci-demos/semaphore-demo-monorepo-javascript
A variation of this theme happens when we have interrelated components. For instance, a client and a server in the same repository can be tested separately.
https://github.com/semaphoreci-demos/semaphore-demo-javascript
Static code analysis tests
Code analysis tests are another excellent candidate for parallel testing. Static tests represent the first line of defense in the quest to find errors in code. We find things like linters, coverage reports, and complexity analysis tools in this category. All of them run efficiently in parallel.
https://github.com/semaphoreci-demos/semaphore-demo-php-laravel
Testing for multiple environments and operating systems
We use the term environments in the most general way possible here, ranging from browsers to a mix of staging and production machines, and from a selection of mobile devices to different sets of data or API endpoints. The category can also include checking the application’s localization and internationalization features.
For example, testing an application for platforms such as iOS and Android is an excellent fit for parallelization.
https://github.com/semaphoreci-demos/semaphore-demo-react-native
The same applies to testing code on hybrid cloud environments.
https://github.com/semaphoreci-demos/semaphore-demo-static-website
Version and regression testing
Testing the code on various runtimes allows us to find compatibility errors. The following example uses a job matrix to run the tests in a combination of Java SDKs and application versions.
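Conceptually, a job matrix is the Cartesian product of its axes: one job per combination of values. The sketch below shows that expansion in Python; the helper name, axis names, and version numbers are hypothetical and only serve to illustrate the idea.

```python
from itertools import product


def build_matrix(**axes):
    """Expand named axes into one job per combination of values."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


# Hypothetical axis names and versions, for illustration only.
matrix_jobs = build_matrix(java_sdk=["11", "17"], app_version=["1.0", "2.0", "3.0"])
print(len(matrix_jobs))  # 6
print(matrix_jobs[0])    # {'java_sdk': '11', 'app_version': '1.0'}
```

Because the job count is the product of the axis sizes, adding one more axis can multiply the number of machines needed, which is worth keeping in mind when sizing a matrix.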
[Figure: Build matrix]
Benefits of parallelization
There are many advantages to using parallelization.
The bane of parallel testing is slow jobs. Even one abnormally slow job is enough to sink the total run time. The reason for this is that the pipeline cannot be faster than its slowest job, no matter how much parallelization we use. As Warren Buffett said: “You can’t produce a baby in one month by getting nine women pregnant.”
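The effect of one slow job can be seen by simulating how jobs spread across a growing number of machines. The sketch below uses a simple greedy heuristic (longest job first, always onto the least-loaded machine), which is an assumption for illustration and not necessarily how any CI service schedules work. The durations are made up.

```python
import heapq


def makespan(durations, workers):
    """Assign jobs longest-first to the least-loaded worker; return total run time."""
    loads = [0] * workers
    heapq.heapify(loads)
    for duration in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + duration)
    return max(loads)


# Hypothetical durations in seconds: one slow job and five quick ones.
durations = [300, 60, 60, 60, 60, 60]
for workers in (1, 2, 3, 6):
    print(workers, makespan(durations, workers))
# 1 600
# 2 300
# 3 300
# 6 300
```

Going from one machine to two halves the run time, but every machine after that is wasted: the 300-second job sets the floor.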
The solution is to identify and analyze slow jobs to see if they can be optimized or broken down. To aid in this, Semaphore offers a test summary feature to help us analyze job outputs for many popular testing frameworks. And in many cases, it makes more sense to simply use a faster machine.
There are other caveats to keep in mind while working with parallel testing:
In general terms, we have two ways of scaling parallel tests in CI/CD: vertically and horizontally.
Vertical parallelization happens when using tools that can isolate tasks and run them concurrently, taking advantage of the many cores the CI machine has. Examples are Pants, Bazel, or Earthly.
Vertical tests are great because they are almost always very hands-off: the tool automatically figures out the best way of running the tests. The flip side is that once we exceed the capacity of one machine, we have to figure out how to distribute the load in a testing cluster.
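As a very small illustration of the vertical approach, the sketch below runs several stand-in tests concurrently on one machine, sizing the pool to the number of cores. It is a simplification under stated assumptions: the tests are simulated with a short sleep, the names are hypothetical, and it uses threads, whereas build tools such as the ones mentioned above typically isolate tasks far more strictly.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def run_test(name, seconds=0.05):
    """Stand-in for a real test: sleep briefly and report success."""
    time.sleep(seconds)
    return name, "passed"


test_names = [f"test_{i}" for i in range(8)]  # hypothetical test names
workers = os.cpu_count() or 1  # one worker per core on this machine

with ThreadPoolExecutor(max_workers=workers) as pool:
    results = dict(pool.map(run_test, test_names))

print(len(results))  # 8
```

The pool size is capped by the machine, which is exactly the limit described above: once the tests outgrow the available cores, the only options are a bigger machine or more machines.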
On the other hand, in horizontal parallelization, we explicitly configure each job and direct the execution flow by design. We’ll see a few examples of how this works in the next section.
Both kinds of parallelization work well on Semaphore. You can scale tests using vertical parallelization by choosing a more powerful machine. And, since Semaphore is a cloud-based service, you get instant, automatic horizontal parallelization.
How to Parallelize Testing in Semaphore
Semaphore supports four levels of horizontal parallelization: job, block, pipeline, and workflow.
1. Parallel jobs
We use parallel jobs to break up a big task into more manageable chunks. For example, we can run static analysis tools on the code or test different platform versions with a job matrix.
[Figure: Job-level parallelization]
Job parallelization is the simplest form, and it’s just a matter of adding more than one job into the block.
Semaphore will run every job in the block simultaneously, using a separate and clean environment for each one.
2. Parallel blocks
Independent blocks can run in parallel and can have their own parallel jobs. This is the second level of parallelization. Parallel blocks are used in monorepo setups and for testing independent components of an app. They are also helpful for testing different environments and running longer jobs outside of the main sequence of steps.
[Figure: Block-level parallelization]
Semaphore automatically runs blocks in parallel when they don’t have dependencies.
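To see how dependencies determine what can run together, the sketch below groups blocks into "waves" using Python's standard `graphlib` module. The block names and dependency graph are hypothetical, and the wave model is a simplification: a real scheduler can start each block as soon as its own dependencies finish, without waiting for the rest of a wave.

```python
from graphlib import TopologicalSorter


def parallel_waves(dependencies):
    """Group blocks into waves; blocks in the same wave can run at the same time."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every block whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical blocks: each key lists the blocks it depends on.
blocks = {
    "build": [],
    "lint": [],
    "unit tests": ["build"],
    "e2e tests": ["build"],
    "deploy": ["unit tests", "e2e tests", "lint"],
}
print(parallel_waves(blocks))
# [['build', 'lint'], ['e2e tests', 'unit tests'], ['deploy']]
```

Blocks with no dependencies land in the first wave, so removing an unnecessary dependency is often the cheapest way to gain parallelism.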
3. Parallel pipelines
Parallelized pipelines are typically used to deploy onto multiple targets at the same time.
[Figure: Pipeline-level parallelization]
To run pipelines simultaneously, enable auto-promotions and set starting conditions to trigger continuous delivery or continuous deployment.
4. Parallel workflows
In the final level, we have workflow parallelization. To make things a bit clearer, a workflow is a group of pipelines triggered by a change in the repository. We can choose to run workflows in parallel or in series by assigning them to queues.
[Figure: Workflow-level parallelization]
Semaphore uses queues for workflows, so configuring parallel workflows is vital to avoid waiting for pipelines while working on highly active repositories. To learn more about how Semaphore manages simultaneous workflows, check out the parallel pipelines page.
Note that you can only take advantage of parallelization in Semaphore if you are on a paid, trial, or open-source plan. On the other hand, the free plan limits users to one job at a time.
Tips for setting up parallel tests
Here are a few pointers on setting up parallel tests:
Parallel testing lets us do more while waiting less. It’s an essential tool to keep sharp and ready so we can always establish a fast feedback loop. However, this isn’t an exact science, so when you start using parallel testing, ease in gradually and allow for a bit of trial and error to find the right balance for your project.