This document provides best practices for tuning Google Cloud resources for optimal performance of high performance computing (HPC) workloads.
Tip: You can configure all recommendations in this document by using a cluster blueprint. To review cluster blueprints that follow these best practices, see Cluster blueprint catalog.
Use the compute-optimized machine type

Use the compute-optimized machine family: H3, C2, or C2D. Virtual machine (VM) instances created with these machine types have a fixed virtual-to-physical core mapping. They also expose NUMA cell architecture to the guest OS. Both features are critical for the performance of tightly-coupled HPC applications.
To reduce communication overhead between VM nodes, consolidate onto a smaller number of c2-standard-60 or c2d-standard-112 VMs (with the same total core count) instead of launching a larger number of smaller C2 or C2D VMs. Inter-node communication is the greatest bottleneck in MPI workloads, and larger VM shapes minimize this communication.
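For example, you can list the compute-optimized machine types available in a zone before choosing a VM shape. The following is a minimal sketch; the us-central1-a zone and the name filter are illustrative, so adjust them for your project.

gcloud compute machine-types list \
    --zones=us-central1-a \
    --filter="name~'^(h3|c2|c2d)-'"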
To reduce internode latency, use VM instance placement policies, which let you control the placement of VMs in Google Cloud data centers. We recommend compact placement policies because they provide lower-latency communication within a single zone.
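A minimal sketch of creating and applying a compact placement policy with the gcloud CLI follows. The policy name hpc-compact, the region, the zone, and the instance name are illustrative placeholders; check the resource policies documentation for the limits that apply to your machine family.

# Create a compact placement policy in the target region.
gcloud compute resource-policies create group-placement hpc-compact \
    --collocation=collocated \
    --region=us-central1

# Apply the policy when creating each VM in the group.
gcloud compute instances create hpc-node-1 \
    --zone=us-central1-a \
    --machine-type=c2-standard-60 \
    --resource-policies=hpc-compact \
    --maintenance-policy=TERMINATE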
Use the HPC VM image

Use the HPC VM image, which incorporates best practices for running HPC applications on Google Cloud. These images are based on Rocky Linux 8 and are available at no additional cost on Google Cloud.
Disable automatic updates

Caution: If your installed software is out of date, it can pose a security risk.

Automatic updates can significantly and unpredictably degrade performance. To disable automatic updates, use the google_disable_automatic_updates metadata flag on VMs that use HPC VM image version v20240712 or later. Any VM image that has an HPC VM image as its base can also use this feature, for example, Slurm images.
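A minimal sketch of setting this metadata flag at VM creation time with the gcloud CLI follows; the value TRUE is an assumption here, so confirm the expected value in the HPC VM image documentation.

gcloud compute instances create VM_NAME \
    --image-family=hpc-rocky-linux-8 \
    --image-project=cloud-hpc-image-public \
    --machine-type=MACHINE_TYPE \
    --metadata=google_disable_automatic_updates=TRUE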
For example, this setting affects dnf automatic package updates on the following image families:

hpc-rocky-linux-8 (project cloud-hpc-image-public)
slurm-gcp-6-6-hpc-rocky-linux-8 (project schedmd-slurm-public)

Cluster Toolkit provides a convenient setting on relevant modules to set this metadata flag for you: allow_automatic_updates: false. Here is an example using the vm-instance module:
  - id: workstation-rocky
    source: modules/compute/vm-instance
    use: [network]
    settings:
      allow_automatic_updates: false
Here is an example for a Slurm nodeset:
  - id: dynamic_nodeset
    source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset
    use: [network]
    settings:
      node_count_static: 1
      node_count_dynamic_max: 4
      allow_automatic_updates: false

Adjust HPC VM image tunings
To get the best performance on Google Cloud, use the following image tunings.
You can use the following sample command to manually configure a VM to run HPC workloads. However, Cluster Toolkit automatically handles all of this tuning when you use a cluster blueprint.
To create the VM manually, use the Google Cloud CLI and provide the following settings.
gcloud compute instances create VM_NAME \
    --image-family=hpc-rocky-linux-8 \
    --image-project=cloud-hpc-image-public \
    --machine-type=MACHINE_TYPE \
    --network-interface=nic-type=GVNIC \
    --metadata=google_mpi_tuning=--hpcthroughput \
    --threads-per-core=1
The preceding sample command applies the following tunings:
Sets the Google Virtual NIC (gVNIC) network interface to enable better communication performance and higher throughput: --network-interface=nic-type=GVNIC.

Sets the network HPC throughput profile: --metadata=google_mpi_tuning=--hpcthroughput. If the VM already exists, run sudo google_mpi_tuning --hpcthroughput to update the network HPC throughput profile setting.

Disables simultaneous multithreading (SMT) in the guest OS: --threads-per-core=1. If the VM already exists, run sudo google_mpi_tuning --nosmt to disable simultaneous multithreading.

Turns off Meltdown and Spectre mitigations. The HPC VM image enables this setting by default. Caution: Disabling these mitigations can incur a security risk. If the VM already exists, run sudo google_mpi_tuning --nomitigation to turn off Meltdown and Spectre mitigations.
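To confirm these tunings on a running VM, you can check the guest OS and the instance metadata directly. The following is a minimal sketch; the metadata key matches the google_mpi_tuning flag used above, and the lscpu check simply reports whether SMT is disabled.

# Check that SMT is disabled (expect "Thread(s) per core: 1").
lscpu | grep "Thread(s) per core"

# Read the google_mpi_tuning metadata value from inside the VM.
curl -s -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/attributes/google_mpi_tuning"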
Adjust file system tunings

Each primary storage choice for tightly-coupled applications has its own cost, performance profile, APIs, and consistency semantics. The primary choices include the following:
Network File System (NFS) solutions, such as Filestore and Google Cloud NetApp Volumes. These solutions let you deploy shared storage options. Both Filestore and NetApp Volumes are fully managed by Google Cloud. Use them when your application does not have extreme I/O requirements to a single dataset. For performance limits, see the Filestore and NetApp Volumes documentation.
Google Cloud Managed Lustre is a fully managed POSIX-based parallel file system. This solution is commonly used by MPI applications.
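As an example of the managed NFS option, the following sketch creates a basic Filestore instance and mounts its share from a VM. The instance name, tier, capacity, network, and FILESTORE_IP are illustrative placeholders; choose values that match your workload.

# Create a Filestore instance that exports a share named "hpcshare".
gcloud filestore instances create hpc-nfs \
    --zone=us-central1-a \
    --tier=BASIC_SSD \
    --file-share=name=hpcshare,capacity=2560GB \
    --network=name=default

# On each VM, mount the share over NFS (replace FILESTORE_IP with the instance IP).
sudo mkdir -p /mnt/hpcshare
sudo mount -t nfs FILESTORE_IP:/hpcshare /mnt/hpcshare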
Use Intel MPI

For best performance, use Intel MPI.
For Ansys Fluent, use Intel MPI 2018.4.274. Set the version of Intel MPI in Ansys Fluent by using the following command. Replace MPI_DIRECTORY with the path to the directory that contains your Intel MPI library.
export INTELMPI_ROOT="MPI_DIRECTORY/compilers_and_libraries_2018.5.274/linux/mpi/intel64/"
Intel MPI collective algorithms can be tuned for optimal performance. The recommended collective algorithm settings for Ansys Fluent are -genv I_MPI_ADJUST_BCAST 8 -genv I_MPI_ADJUST_ALLREDUCE 10.
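These -genv options can equivalently be exported as environment variables before launching the solver. The following is a minimal sketch of that form, reusing the placeholder MPI_DIRECTORY path from above; the Fluent launch command itself is omitted because it depends on your installation.

# Point the job at the Intel MPI installation (placeholder path from above).
export INTELMPI_ROOT="MPI_DIRECTORY/compilers_and_libraries_2018.5.274/linux/mpi/intel64/"

# Select the recommended collective algorithms for Ansys Fluent.
export I_MPI_ADJUST_BCAST=8
export I_MPI_ADJUST_ALLREDUCE=10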
For Simcenter STAR-CCM+, we also recommend you use the TCP fabric providers by specifying the following environment variables: I_MPI_FABRICS shm:ofi and FI_PROVIDER tcp.
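For example, you might export these variables in the shell or job script that launches STAR-CCM+; this is a minimal sketch of that setup.

# Use the shared-memory plus OFI fabric with the TCP provider for STAR-CCM+.
export I_MPI_FABRICS=shm:ofi
export FI_PROVIDER=tcp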
The following is a summary of the recommended best practices for running HPC workloads on Google Cloud:

Use a compute-optimized machine family (H3, C2, or C2D), and consolidate onto larger VM shapes to reduce inter-node communication.
Use compact placement policies to lower internode latency within a single zone.
Use the HPC VM image, disable automatic updates, and apply the gVNIC, HPC throughput profile, and SMT tunings.
Choose a file system that matches your I/O requirements, such as Filestore, NetApp Volumes, or Managed Lustre.
Use Intel MPI with the recommended collective algorithm and fabric settings for your application.