This page shows you how to consume reserved Compute Engine zonal resources in specific GKE workloads. These capacity reservations give you a high level of assurance that specific hardware is available for your workloads.
Ensure that you're already familiar with the concepts of Compute Engine reservations, like consumption types, share types, and provisioning types. For details, see Reservations of Compute Engine zonal resources.
This page is intended for the following people:
Compute Engine capacity reservations let you provision specific hardware configurations in Google Cloud zones, either immediately or at a specified future time. You can then consume this reserved capacity in GKE.
Depending on your GKE mode of operation, you can consume the following reservation types:
To enable consuming reservations to create your resources, you must specify a reservation affinity, like any or specific.
GKE lets you consume reservations directly in individual workloads by using Kubernetes nodeSelectors in your workload manifest or by creating Standard mode node pools that consume the reservation. This page describes the approach of directly selecting reservations in individual resources.
You can also configure GKE to consume reservations during scaling operations that create new nodes by using custom compute classes. Custom compute classes let platform administrators define a hierarchy of node configurations for GKE to prioritize during node scaling so that workloads run on your selected hardware.
You can specify reservations in your custom compute class configuration so that any GKE workload that uses the compute class tells GKE to consume the specified reservations.
To learn more, see Consume Compute Engine reservations on the "About custom compute classes" page.
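As a minimal sketch of that approach, the following hypothetical compute class tells GKE to prioritize a named specific reservation. The class name, machine family, and reservation name here are placeholders, not values defined on this page:

apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: reserved-first
spec:
  priorities:
  # GKE tries this priority first: c3 nodes that consume the named reservation.
  - machineFamily: c3
    reservations:
      affinity: Specific
      specific:
      - name: my-reservation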
Before you begin
Before you start, make sure that you have performed the following tasks:
Update the gcloud CLI to the latest version:
gcloud components update
Note: For existing gcloud CLI installations, make sure to set the compute/region property. If you primarily use zonal clusters, set the compute/zone property instead. By setting a default location, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location.
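For example, you can set a default region as follows. The region value here is only illustrative:

gcloud config set compute/region us-central1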
You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.

Autopilot clusters support consuming resources from Compute Engine capacity reservations in the same project or in a shared project. You must set the consumption type property of the target reservation to specific, and you must explicitly select that reservation in your manifest. If you don't explicitly specify a reservation, Autopilot clusters won't consume reservations. To learn more about reservation consumption types, see How reservations work.
These reservations qualify for Compute flexible committed use discounts. You must use the Accelerator compute class or the Performance compute class to consume capacity reservations.
Before you begin, create an Autopilot cluster running the following versions:
Autopilot Pods can consume reservations that have the specific consumption type property in the same project as the cluster or in a shared reservation from a different project. You can consume the reserved hardware by explicitly referencing that reservation in your manifest. You can consume reservations in Autopilot for the following types of hardware:
Any of the following types of GPUs:
nvidia-b200: NVIDIA B200 (180GB)
nvidia-h200-141gb: NVIDIA H200 (141GB)
nvidia-h100-mega-80gb: NVIDIA H100 Mega (80GB)
nvidia-h100-80gb: NVIDIA H100 (80GB)
nvidia-a100-80gb: NVIDIA A100 (80GB)
nvidia-tesla-a100: NVIDIA A100 (40GB)
nvidia-l4: NVIDIA L4
nvidia-tesla-t4: NVIDIA T4
Any of the following types of TPUs:
tpu-v6e-slice: TPU v6e slice
tpu-v5p-slice: TPU v5p slice
tpu-v5-lite-podslice: TPU v5 lite podslice
tpu-v5-lite-device: TPU v5 lite device
tpu-v4-lite-device: TPU v4 lite device
tpu-v4-podslice: TPU v4 podslice
tpu-v3-device: TPU v3 device
tpu-v3-slice: TPU v3 podslice
To create a capacity reservation, see the following resources. The reservation must meet the following requirements:
The reservation uses the specific consumption type. For example, in the gcloud CLI, you must specify the --require-specific-reservation flag when you create the reservation.
GKE automatically attaches any Local SSDs from the selected specific reservation to your node. You don't need to select individual Local SSDs in your workload manifest. For example, if the reservation that you select includes two Local SSDs, the nodes that GKE creates from that reservation have two Local SSDs attached.
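If you already created a reservation, one way to confirm that it uses the specific consumption type is to describe it and check that the specificReservationRequired field in the output is true. The zone value is a placeholder:

gcloud compute reservations describe RESERVATION_NAME \
    --zone=ZONE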
Consume a specific reservation in the same project in Autopilot
This section shows you how to consume a specific capacity reservation that's in the same project as your cluster. You can use kubectl or Terraform.
kubectl
Save the following manifest as specific-autopilot.yaml. This manifest has node selectors that consume a specific reservation. You can use VM instances or accelerators.
VM instances
apiVersion: v1
kind: Pod
metadata:
  name: specific-same-project-pod
spec:
  nodeSelector:
    cloud.google.com/compute-class: Performance
    cloud.google.com/machine-family: MACHINE_SERIES
    cloud.google.com/reservation-name: RESERVATION_NAME
    cloud.google.com/reservation-affinity: "specific"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 2
        memory: "4Gi"
Replace the following:
MACHINE_SERIES: a machine series that contains the machine type of the VMs in your specific capacity reservation. For example, if your reservation is for c3-standard-4 machine types, specify c3 in the MACHINE_SERIES field.
RESERVATION_NAME: the name of the Compute Engine capacity reservation.
GPU Accelerators
apiVersion: v1
kind: Pod
metadata:
  name: specific-same-project-pod
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: ACCELERATOR
    cloud.google.com/reservation-name: RESERVATION_NAME
    cloud.google.com/reservation-affinity: "specific"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
      limits:
        nvidia.com/gpu: QUANTITY
Replace the following:
ACCELERATOR: the accelerator that you reserved in the Compute Engine capacity reservation. Must be one of the following values:
nvidia-b200: NVIDIA B200 (180GB)
nvidia-h200-141gb: NVIDIA H200 (141GB)
nvidia-h100-mega-80gb: NVIDIA H100 Mega (80GB)
nvidia-h100-80gb: NVIDIA H100 (80GB)
nvidia-a100-80gb: NVIDIA A100 (80GB)
nvidia-tesla-a100: NVIDIA A100 (40GB)
nvidia-l4: NVIDIA L4
nvidia-tesla-t4: NVIDIA T4
RESERVATION_NAME: the name of the Compute Engine capacity reservation.
QUANTITY: the number of GPUs to attach to the container. Must be a supported quantity for the specified GPU, as described in Supported GPU quantities.
TPU Accelerators
apiVersion: v1
kind: Pod
metadata:
  name: specific-same-project-pod
spec:
  nodeSelector:
    cloud.google.com/gke-tpu-accelerator: ACCELERATOR
    cloud.google.com/gke-tpu-topology: TOPOLOGY
    cloud.google.com/reservation-name: RESERVATION_NAME
    cloud.google.com/reservation-affinity: "specific"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
      limits:
        google.com/tpu: QUANTITY
Replace the following:
ACCELERATOR: the accelerator that you reserved in the Compute Engine capacity reservation. Must be one of the following values:
tpu-v6e-slice: TPU v6e slice
tpu-v5p-slice: TPU v5p slice
tpu-v5-lite-podslice: TPU v5 lite podslice
tpu-v5-lite-device: TPU v5 lite device
tpu-v4-lite-device: TPU v4 lite device
tpu-v4-podslice: TPU v4 podslice
tpu-v3-device: TPU v3 device
tpu-v3-slice: TPU v3 podslice
TOPOLOGY: the TPU topology.
RESERVATION_NAME: the name of the Compute Engine capacity reservation.
QUANTITY: the number of TPUs to attach to the container. Must be aligned with the TPU topology.
Deploy the Pod:
kubectl apply -f specific-autopilot.yaml
Autopilot uses the reserved capacity in the specified reservation to provision a new node to place the Pod.
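To verify where the Pod landed, you can list nodes along with the reservation-name label that the selector above targets; nodes provisioned from the reservation report its name in that column:

kubectl get nodes -L cloud.google.com/reservation-name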
Terraform
To consume a specific reservation in the same project with VM instances using Terraform, refer to the following example:
To consume a specific reservation in the same project with the Accelerator compute class using Terraform, refer to the following example:
To learn more about using Terraform, see Terraform support for GKE.
Consume a specific shared reservation in Autopilot
This section uses the following terms:
Owner project: the project that owns the capacity reservation.
Consumer project: the project in which you deploy workloads that consume the shared reservation.
To consume a shared reservation, you must grant the GKE service agent access to the reservation in the project that owns the reservation. Do the following:
Create a custom IAM role that contains the compute.reservations.list permission in the owner project:
gcloud iam roles create ROLE_NAME \
    --project=OWNER_PROJECT_ID \
    --permissions='compute.reservations.list'
Replace the following:
ROLE_NAME: a name for your new role.
OWNER_PROJECT_ID: the project ID of the project that owns the capacity reservation.
Give the GKE service agent in the consumer project access to list shared reservations in the owner project:
gcloud projects add-iam-policy-binding OWNER_PROJECT_ID \
    --project=OWNER_PROJECT_ID \
    --member=serviceAccount:service-CONSUMER_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com \
    --role='projects/OWNER_PROJECT_ID/roles/ROLE_NAME'
Replace CONSUMER_PROJECT_NUMBER with the numerical project number of your consumer project. To find this number, see Identifying projects in the Resource Manager documentation.
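To confirm that the binding was created, you can inspect the owner project's IAM policy and look for the custom role and the GKE service agent:

gcloud projects get-iam-policy OWNER_PROJECT_ID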
Save the following manifest as shared-autopilot.yaml. This manifest has nodeSelectors that tell GKE to consume a specific shared reservation.
apiVersion: v1
kind: Pod
metadata:
  name: performance-pod
spec:
  nodeSelector:
    cloud.google.com/compute-class: Performance
    cloud.google.com/machine-family: MACHINE_SERIES
    cloud.google.com/reservation-name: RESERVATION_NAME
    cloud.google.com/reservation-project: OWNER_PROJECT_ID
    cloud.google.com/reservation-affinity: "specific"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 2
        memory: "4Gi"
Replace the following:
MACHINE_SERIES: a machine series that contains the machine type of the VMs in your specific capacity reservation. For example, if your reservation is for c3-standard-4 machine types, specify c3 in the MACHINE_SERIES field.
RESERVATION_NAME: the name of the Compute Engine capacity reservation.
OWNER_PROJECT_ID: the project ID of the project that owns the capacity reservation.
apiVersion: v1
kind: Pod
metadata:
  name: specific-same-project-pod
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: ACCELERATOR
    cloud.google.com/reservation-name: RESERVATION_NAME
    cloud.google.com/reservation-project: OWNER_PROJECT_ID
    cloud.google.com/reservation-affinity: "specific"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
      limits:
        nvidia.com/gpu: QUANTITY
Replace the following:
ACCELERATOR: the accelerator that you reserved in the Compute Engine capacity reservation. Must be one of the following values:
nvidia-b200: NVIDIA B200 (180GB)
nvidia-h200-141gb: NVIDIA H200 (141GB)
nvidia-h100-mega-80gb: NVIDIA H100 Mega (80GB)
nvidia-h100-80gb: NVIDIA H100 (80GB)
nvidia-a100-80gb: NVIDIA A100 (80GB)
nvidia-tesla-a100: NVIDIA A100 (40GB)
nvidia-l4: NVIDIA L4
nvidia-tesla-t4: NVIDIA T4
RESERVATION_NAME: the name of the Compute Engine capacity reservation.
OWNER_PROJECT_ID: the project ID of the project that owns the capacity reservation.
QUANTITY: the number of GPUs to attach to the container. Must be a supported quantity for the specified GPU, as described in Supported GPU quantities.
apiVersion: v1
kind: Pod
metadata:
  name: specific-shared-project-pod
spec:
  nodeSelector:
    cloud.google.com/gke-tpu-accelerator: ACCELERATOR
    cloud.google.com/gke-tpu-topology: TOPOLOGY
    cloud.google.com/reservation-name: RESERVATION_NAME
    cloud.google.com/reservation-project: OWNER_PROJECT_ID
    cloud.google.com/reservation-affinity: "specific"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
      limits:
        google.com/tpu: QUANTITY
Replace the following:
ACCELERATOR: the accelerator that you reserved in the Compute Engine capacity reservation. Must be one of the following values:
tpu-v6e-slice: TPU v6e slice
tpu-v5p-slice: TPU v5p slice
tpu-v5-lite-podslice: TPU v5 lite podslice
tpu-v5-lite-device: TPU v5 lite device
tpu-v4-lite-device: TPU v4 lite device
tpu-v4-podslice: TPU v4 podslice
tpu-v3-device: TPU v3 device
tpu-v3-slice: TPU v3 podslice
TOPOLOGY: the TPU topology.
RESERVATION_NAME: the name of the Compute Engine capacity reservation.
OWNER_PROJECT_ID: the project ID of the project that owns the capacity reservation.
QUANTITY: the number of TPUs to attach to the container. Must be aligned with the TPU topology.
Deploy the Pod:
kubectl apply -f shared-autopilot.yaml
Autopilot uses the reserved capacity in the specified reservation to provision a new node to place the Pod.
Consume a specific reservation block in Autopilot
This section shows you how to consume a specific capacity reservation block that's in the same project as your cluster or in a shared project. This feature is available only for specific accelerators. You can use kubectl to configure your Pod to consume the reservation block.
Save the following manifest as reservation-block-autopilot.yaml. This manifest has node selectors that consume a specific reservation.
Local Project
apiVersion: v1
kind: Pod
metadata:
  name: specific-same-project-pod
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: ACCELERATOR
    cloud.google.com/reservation-name: RESERVATION_NAME
    cloud.google.com/reservation-affinity: "specific"
    cloud.google.com/reservation-blocks: RESERVATION_BLOCKS_NAME
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
      limits:
        nvidia.com/gpu: QUANTITY
Replace the following:
ACCELERATOR: the accelerator that you reserved in the Compute Engine capacity reservation. Must be one of the following values:
nvidia-b200: NVIDIA B200 (180GB)
nvidia-h200-141gb: NVIDIA H200 (141GB)
RESERVATION_NAME: the name of the Compute Engine capacity reservation.
RESERVATION_BLOCKS_NAME: the name of the Compute Engine capacity reservation block.
QUANTITY: the number of GPUs to attach to the container. Must be a supported quantity for the specified GPU, as described in Supported GPU quantities.
For reservations that are owned by a different project, add the cloud.google.com/reservation-project: OWNER_PROJECT_ID node selector to the spec.nodeSelector field. Replace OWNER_PROJECT_ID with the project ID of the project that owns the capacity reservation.
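For example, the nodeSelector field with the owner project added would look like the following:

  nodeSelector:
    cloud.google.com/gke-accelerator: ACCELERATOR
    cloud.google.com/reservation-name: RESERVATION_NAME
    cloud.google.com/reservation-project: OWNER_PROJECT_ID
    cloud.google.com/reservation-affinity: "specific"
    cloud.google.com/reservation-blocks: RESERVATION_BLOCKS_NAME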
Deploy the Pod:
kubectl apply -f reservation-block-autopilot.yaml
Autopilot uses the reserved capacity in the specified reservation block to provision a new node to place the Pod.
Consume a specific reservation sub-block in Autopilot
This section shows you how to consume a specific capacity reservation sub-block that's in the same project as your cluster or in a shared project.
Save the following ComputeClass manifest as reservation-sub-block-computeclass.yaml:
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: specific-reservation-subblock
spec:
  nodePoolAutoCreation:
    enabled: true
  priorities:
  - gpu:
      type: ACCELERATOR_TYPE
      count: ACCELERATOR_COUNT
    reservations:
      affinity: Specific
      specific:
      - name: RESERVATION_NAME
        project: RESERVATION_PROJECT_ID
        reservationBlock:
          name: RESERVATION_BLOCK_NAME
        reservationSubBlock:
          name: RESERVATION_SUB_BLOCK_NAME
Replace the following:
ACCELERATOR_TYPE: the accelerator that you reserved in the Compute Engine capacity reservation. This value must be nvidia-gb200.
ACCELERATOR_COUNT: the number of accelerators to attach to each node. This value must be a supported quantity for the specified accelerator type. For more information, see Supported GPU quantities.
RESERVATION_NAME: the name of the Compute Engine capacity reservation.
RESERVATION_PROJECT_ID: the project ID of the project that owns the capacity reservation.
RESERVATION_BLOCK_NAME: the name of the Compute Engine capacity reservation block.
RESERVATION_SUB_BLOCK_NAME: the name of the Compute Engine capacity reservation sub-block.
Save the following Pod manifest as reservation-sub-block-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: reservation-sub-block-pod
spec:
  nodeSelector:
    cloud.google.com/compute-class: specific-reservation-subblock
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
      limits:
        nvidia.com/gpu: CONTAINER_GPU_COUNT
Replace CONTAINER_GPU_COUNT with the number of GPUs to allocate to the container. This value must be less than or equal to the number of accelerators that the compute class attaches to each node.
Deploy the Pod:
kubectl apply -f reservation-sub-block-pod.yaml
Autopilot uses the reserved capacity in the specified reservation sub-block to provision a new node to run the Pod.
Consume reservations in GKE Standard
When you create a cluster or node pool, you can indicate the reservation consumption mode by specifying the --reservation-affinity flag.
Consuming instances from any matching reservation
You can create a reservation and instances to consume any reservation using the gcloud CLI or Terraform.
gcloud
To consume from any matching reservations automatically, set the reservation affinity flag to --reservation-affinity=any. Because any is the default value defined in Compute Engine, you can omit the reservation affinity flag entirely.
In the any reservation consumption mode, nodes first take capacity from all single-project reservations before any shared reservations, because the shared reservations are more available to other projects. For more information about how instances are automatically consumed, see Consumption order.
Create a reservation of three VM instances:
gcloud compute reservations create RESERVATION_NAME \
    --machine-type=MACHINE_TYPE --vm-count=3
Replace the following:
RESERVATION_NAME: the name of the reservation to create.
MACHINE_TYPE: the type of machine (name only) to use for the reservation. For example, n1-standard-2.
Verify that the reservation was created successfully:
gcloud compute reservations describe RESERVATION_NAME
Create a cluster having one node to consume any matching reservation:
gcloud container clusters create CLUSTER_NAME \
    --machine-type=MACHINE_TYPE --num-nodes=1 \
    --reservation-affinity=any
Replace CLUSTER_NAME with the name of the cluster to create.
Create a node pool with three nodes to consume any matching reservation:
gcloud container node-pools create NODEPOOL_NAME \
    --cluster CLUSTER_NAME --num-nodes=3 \
    --machine-type=MACHINE_TYPE --reservation-affinity=any
Replace NODEPOOL_NAME with the name of the node pool to create.
The total number of nodes is four, which exceeds the capacity of the reservation. Three of the nodes consume the reservation while the last node takes capacity from the general Compute Engine resource pool.
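To confirm this split, you can check how many instances the reservation reports as in use; the zone value is a placeholder:

gcloud compute reservations describe RESERVATION_NAME \
    --zone=ZONE --format="value(specificReservation.inUseCount)"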
Terraform
To create a reservation of three VM instances using Terraform, refer to the following example:
To create a cluster having one node to consume any matching reservation using Terraform, refer to the following example:
To create a node pool with three nodes to consume any matching reservation using Terraform, refer to the following example:
To learn more about using Terraform, see Terraform support for GKE.
Consuming a specific single-project reservation
Note: If you're creating a node pool with GPUs, and you're consuming any reservation where the capacity is in a single zone, ensure that you follow the additional recommendations for the scenario of capacity in a single zone. For more information, see Consuming GPU reservations.
To consume a specific reservation, set the reservation affinity flag to --reservation-affinity=specific and provide the specific reservation name. In this mode, instances must take capacity from the specified reservation in the zone. The request fails if the reservation does not have sufficient capacity.
To create a reservation and instances to consume a specific reservation, perform the following steps. You can use the gcloud CLI or Terraform.
gcloud
Create a specific reservation for three VM instances:
gcloud compute reservations create RESERVATION_NAME \
    --machine-type=MACHINE_TYPE --vm-count=3 \
    --require-specific-reservation
Replace the following:
RESERVATION_NAME: the name of the reservation to create.
MACHINE_TYPE: the type of machine (name only) to use for the reservation. For example, n1-standard-2.
Create a node pool with a single node to consume a specific single-project reservation:
gcloud container node-pools create NODEPOOL_NAME \
    --cluster CLUSTER_NAME \
    --machine-type=MACHINE_TYPE --num-nodes=1 \
    --reservation-affinity=specific --reservation=RESERVATION_NAME
Replace the following:
NODEPOOL_NAME: the name of the node pool to create.
CLUSTER_NAME: the name of the cluster that you created.
Terraform
To create a specific reservation using Terraform, refer to the following example:
To create a node pool with a single node to consume a specific single-project reservation using Terraform, refer to the following example:
To learn more about using Terraform, see Terraform support for GKE.
Consuming a specific shared reservation
To create a specific shared reservation and consume the shared reservation, perform the following steps. You can use the gcloud CLI or Terraform.
Create a specific shared reservation:
gcloud compute reservations create RESERVATION_NAME \
    --machine-type=MACHINE_TYPE --vm-count=3 \
    --zone=ZONE \
    --require-specific-reservation \
    --project=OWNER_PROJECT_ID \
    --share-setting=projects \
    --share-with=CONSUMER_PROJECT_IDS
Replace the following:
RESERVATION_NAME: the name of the reservation to create.
MACHINE_TYPE: the name of the type of machine to use for the reservation. For example, n1-standard-2.
OWNER_PROJECT_ID: the project ID of the project in which you want to create this shared reservation. If you omit the --project flag, GKE uses the current project as the owner project by default.
CONSUMER_PROJECT_IDS: a comma-separated list of the project IDs of projects that you want to share this reservation with. For example, project-1,project-2. You can include 1 to 100 consumer projects. These projects must be in the same organization as the owner project. Don't include the OWNER_PROJECT_ID, because it can consume this reservation by default.
Consume the shared reservation:
gcloud container node-pools create NODEPOOL_NAME \
    --cluster CLUSTER_NAME \
    --machine-type=MACHINE_TYPE --num-nodes=1 \
    --reservation-affinity=specific \
    --reservation=projects/OWNER_PROJECT_ID/reservations/RESERVATION_NAME
Replace the following:
NODEPOOL_NAME: the name of the node pool to create.
CLUSTER_NAME: the name of the cluster that you created.
Terraform
To create a specific shared reservation using Terraform, refer to the following example:
To consume the specific shared reservation using Terraform, refer to the following example:
To learn more about using Terraform, see Terraform support for GKE.
Additional considerations for consuming from a specific reservation
When a node pool is created with specific reservation affinity, including default node pools during cluster creation, its size is limited to the capacity of the specific reservation over the node pool's entire lifetime. This affects the following GKE features:
Consuming GPU reservations
To create a Standard node pool that consumes a GPU reservation, or that consumes any reservation where the capacity is in a single zone, you must specify the --node-locations flag when you add a node pool. When you create a regional Standard cluster or a multi-zonal Standard cluster, specifying the node locations ensures that GKE creates nodes only in a zone where you have reserved GPU capacity.
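As an illustrative sketch, a GPU node pool that pins both the zone and a specific reservation might be created as follows. The machine type, accelerator, and zone are placeholders, not recommendations:

gcloud container node-pools create gpu-pool \
    --cluster=CLUSTER_NAME \
    --machine-type=g2-standard-8 \
    --accelerator=type=nvidia-l4,count=1 \
    --node-locations=us-central1-a \
    --reservation-affinity=specific \
    --reservation=RESERVATION_NAME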
For detailed instructions about creating a node pool that uses GPUs, refer to Create a GPU node pool.
Consuming TPU reservations
To create a Standard node pool that consumes a TPU reservation, you must specify the --node-locations flag when you add a node pool. When you create a regional Standard cluster or a multi-zonal Standard cluster, specifying the node locations ensures that GKE creates nodes only in a zone where you have reserved TPU capacity.
TPU reservations differ from other machine types. The following are TPU-specific aspects that you should consider when creating TPU reservations:
SPECIFIC is the only supported value for the --reservation-affinity flag.
For detailed instructions about creating a node pool that uses TPUs, refer to Create a TPU node pool.
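For example, a single-host TPU node pool pinned to a reserved zone might look like the following sketch; the machine type and zone are placeholders for whatever hardware your reservation holds:

gcloud container node-pools create tpu-pool \
    --cluster=CLUSTER_NAME \
    --machine-type=ct5lp-hightpu-4t \
    --node-locations=us-central1-a \
    --reservation-affinity=specific \
    --reservation=RESERVATION_NAME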
Creating nodes without consuming reservations
To explicitly avoid consuming resources from any reservations, set the affinity to --reservation-affinity=none.
Create a cluster that won't consume any reservation:
gcloud container clusters create CLUSTER_NAME --reservation-affinity=none
Replace CLUSTER_NAME with the name of the cluster to create.
Create a node pool that won't consume any reservation:
gcloud container node-pools create NODEPOOL_NAME \
    --cluster CLUSTER_NAME \
    --reservation-affinity=none
Replace NODEPOOL_NAME with the name of the node pool to create.
When using node pools running in multiple zones with reservations that are not equal between zones, you can use the --location-policy=ANY flag. This flag ensures that when new nodes are added to the cluster, they are created in the zone that still has unused reservations.
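A sketch of that flag in context, assuming an autoscaled node pool; the zones and node limits are placeholders, and this assumes --location-policy applies when autoscaling is enabled:

gcloud container node-pools create NODEPOOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-locations=us-central1-a,us-central1-b \
    --enable-autoscaling --min-nodes=0 --max-nodes=3 \
    --reservation-affinity=any \
    --location-policy=ANY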
Clean up
To avoid incurring charges to your Cloud Billing account for the resources used on this page:
Delete the clusters you created by running the following command for each of the clusters:
gcloud container clusters delete CLUSTER_NAME
Delete the reservations you created by running the following command for each of the reservations:
gcloud compute reservations delete RESERVATION_NAME