By default, automatic upgrades are enabled for Google Kubernetes Engine (GKE) clusters and for GKE Standard node pools.
This page explains how to manually request an upgrade or downgrade for the control plane or nodes of a GKE cluster.
To upgrade a cluster, GKE updates the version that the control plane and nodes run. Clusters are upgraded to either a newer minor version (for example, 1.24 to 1.25) or a newer patch version (for example, 1.24.2-gke.100 to 1.24.5-gke.200). For more information, see GKE versioning and support.
You can learn more about how automatic and manual cluster upgrades work. You can also control when auto-upgrades can and cannot occur by configuring maintenance windows and exclusions.
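For example, a minimal sketch of restricting auto-upgrades to a weekend maintenance window might look like the following (my-cluster, the location, and the times are placeholders to adapt to your environment):
# Allow upgrades only during a recurring weekend window
gcloud container clusters update my-cluster \
    --location=us-central1 \
    --maintenance-window-start=2024-01-06T01:00:00Z \
    --maintenance-window-end=2024-01-06T05:00:00Z \
    --maintenance-window-recurrence='FREQ=WEEKLY;BYDAY=SA,SU'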
Note about terminology: A cluster's control plane and nodes do not necessarily run the same version at all times. In this topic, cluster upgrade and control plane upgrade are used interchangeably, and are differentiated from node upgrades.
New versions of GKE are announced regularly, and you can receive notice about the new versions available for each specific cluster with cluster notifications. To find specific auto-upgrade targets for clusters, get information about a cluster's upgrades.
To learn about available versions, see Versioning. To learn more about clusters, see Cluster architecture. For guidance on upgrading clusters, see Best practices for upgrading clusters.
Before you begin
Before you start, make sure that you have performed the following tasks:
gcloud components update
Note: For existing gcloud CLI installations, make sure to set the compute/region property. If you primarily use zonal clusters, set the compute/zone property instead. By setting a default location, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location. You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.
Before upgrading a node pool, ensure that any data you need to keep is stored in a Pod using persistent volumes, which are backed by persistent disks. Persistent disks are unmounted, rather than erased, during upgrades, and their data is "handed off" between Pods.
The following restrictions pertain to persistent disks:
To learn how to add a persistent disk to an existing node instance, see Adding or resizing zonal persistent disks in the Compute Engine documentation.
About upgrading
A cluster's control plane and nodes are upgraded separately.
Cluster control planes are always upgraded on a regular basis, regardless of whether your cluster is enrolled in a release channel.
Limitations
Alpha clusters cannot be upgraded.
Supported versions
The release notes announce when new versions become available and when older versions are no longer available. At any time, you can list all supported cluster and node versions with the following command:
gcloud container get-server-config \
--location=CONTROL_PLANE_LOCATION
Replace CONTROL_PLANE_LOCATION with the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
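For example, you can narrow the output to just the default version and the valid control plane versions for a region (us-central1 is a placeholder location):
# List the default and valid control plane versions
gcloud container get-server-config \
    --location=us-central1 \
    --format="yaml(defaultClusterVersion, validMasterVersions)"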
If your cluster is enrolled in a release channel, you can upgrade to a patch version in a different release channel with the same minor version as your control plane. For example, you can upgrade your cluster from version 1.21.12-gke.1700 in the Regular channel to 1.21.13-gke.900 in the Rapid channel. For more information, refer to Running patch versions from a newer channel. All Autopilot clusters are enrolled in a release channel.
Downgrading limitations
You can downgrade the version of your cluster to an earlier version in certain scenarios.
To mitigate an unsuccessful cluster control plane upgrade, you can downgrade your control plane to a previous patch release if the version is an earlier patch release within the same minor version. For example, if your cluster's control plane is running GKE 1.25.3-gke.400, you can downgrade the control plane to 1.25.2-gke.100, if that version is still available.
You can't downgrade a Kubernetes cluster control plane to an earlier minor version. For example, if your control plane runs GKE version 1.25, you cannot downgrade to 1.24. If you attempt to do this, the following error message appears:
ERROR: (gcloud.container.clusters.upgrade) ResponseError: code=400,
message=Master cannot be upgraded to "1.24.3-gke.100": specified version is not
newer than the current version.
You can't downgrade the minor version of a cluster's control plane, so we recommend that you test and qualify minor version upgrades with clusters in a testing environment when a new minor version becomes available, but before that version becomes the default. This is especially recommended if your cluster might be affected by significant changes in the next minor version, such as deprecated APIs or removed features.
To mitigate an unsuccessful node pool upgrade, you can downgrade a node pool to an earlier patch release or minor version. Ensure that you don't downgrade nodes to a version that is more than two minor versions behind the cluster control plane version.
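For example, before downgrading a node pool, you can compare the control plane and node pool versions to confirm that the two-minor-version skew is respected (my-cluster, default-pool, and the location are placeholders):
# Current control plane version
gcloud container clusters describe my-cluster \
    --location=us-central1 \
    --format="value(currentMasterVersion)"
# Current node pool version
gcloud container node-pools describe default-pool \
    --cluster=my-cluster \
    --location=us-central1 \
    --format="value(version)"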
Upgrading the cluster
Google upgrades clusters and nodes automatically. For more control over which auto-upgrades your cluster and its nodes receive, you can enroll the cluster in a release channel. All Autopilot clusters are automatically enrolled in a release channel.
To learn more about managing your cluster's GKE version, see Upgrades.
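If your cluster is not yet enrolled, a minimal sketch of enrolling it in the Regular channel looks like the following (my-cluster and the location are placeholders):
gcloud container clusters update my-cluster \
    --location=us-central1 \
    --release-channel=regular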
You can initiate a manual upgrade any time after a new version becomes available.
Manually upgrading the control plane
Note: You cannot upgrade your cluster more than one minor version at a time. For example, you can upgrade a cluster from version 1.25 to 1.26, but not directly from 1.25 to 1.27. For more information, refer to Can I skip versions during a cluster upgrade?.
When you initiate a cluster upgrade, you can't modify the cluster's configuration for several minutes, until the control plane is accessible again. If you need to prevent downtime during control plane upgrades, consider using an Autopilot cluster or a regional Standard cluster. This operation does not affect the availability of the worker nodes that your workloads run on; they remain available during control plane upgrades.
You can manually upgrade your Autopilot or Standard control plane using the Google Cloud console or the Google Cloud CLI.
gcloud
To see the available versions for your cluster's control plane, run the following command:
gcloud container get-server-config \
--location=CONTROL_PLANE_LOCATION
To upgrade to the default cluster version, run the following command:
gcloud container clusters upgrade CLUSTER_NAME \
--master \
--location=CONTROL_PLANE_LOCATION
To upgrade to a specific version that is not the default, specify the --cluster-version flag, as in the following command:
gcloud container clusters upgrade CLUSTER_NAME \
--master \
--location=CONTROL_PLANE_LOCATION \
--cluster-version=VERSION
Replace VERSION with the version that you want to upgrade your cluster to. You can use a specific version, such as 1.18.17-gke.100, or a version alias, like latest. For more information, see Specifying cluster version.
VERSION must be a valid minor version for the release channel, or a valid patch version in a newer release channel. All Autopilot clusters are enrolled in a release channel.
Console
To manually update your cluster control plane, perform the following steps:
Go to the Google Kubernetes Engine page in Google Cloud console.
Click the name of the cluster.
Under Cluster basics, click edit Upgrade Available next to Version.
Select the desired version, then click Save Changes.
After upgrading a Standard control plane, you can upgrade its nodes. By default, Standard nodes created using the Google Cloud console have auto-upgrade enabled, so this happens automatically. Autopilot always upgrades nodes automatically.
Downgrading clusters
Note: Before attempting to downgrade a cluster, ensure that you're already familiar with the limitations.
Downgrade the cluster control plane to an earlier patch version:
gcloud container clusters upgrade CLUSTER_NAME \
--master \
--location=CONTROL_PLANE_LOCATION \
--cluster-version=VERSION
VERSION must be an available patch version for the release channel. All Autopilot clusters are enrolled in a release channel.
Disabling cluster auto-upgrades
Infrastructure security is a high priority for GKE. As such, control planes are upgraded on a regular basis, and these upgrades cannot be disabled. However, you can apply maintenance windows and exclusions to temporarily suspend upgrades for control planes and nodes.
Although it is not recommended, you can disable node auto-upgrade.
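If you do need to disable it, a minimal sketch for a single node pool looks like the following (names and location are placeholders):
# Turn off auto-upgrade for one node pool
gcloud container node-pools update default-pool \
    --cluster=my-cluster \
    --location=us-central1 \
    --no-enable-autoupgrade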
Check recent control plane upgrade history
For a snapshot of a cluster's recent auto-upgrade history, get information about a cluster's upgrades.
Alternatively, you can list recent operations to see when the control plane was upgraded:
gcloud container operations list --filter="TYPE:UPGRADE_MASTER AND TARGET:CLUSTER_NAME" \
--location=CONTROL_PLANE_LOCATION
Upgrading node pools
By default, a cluster's nodes have auto-upgrade enabled. Node auto-upgrades ensure that your cluster's control plane and node version remain in sync and in compliance with the Kubernetes version skew policy, which ensures that control planes are compatible with nodes up to two minor versions older than the control plane. For example, Kubernetes 1.29 control planes are compatible with Kubernetes 1.27 nodes.
Best practice: Avoid disabling node auto-upgrades so that your cluster benefits from the version alignment and skew compliance described in the preceding paragraph.
With GKE node pool upgrades, you can choose between two configurable upgrade strategies, namely surge upgrades and blue-green upgrades.
Choose a strategy and use the parameters to tune the strategy to best fit your cluster environment's needs.
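For example, a conservative surge configuration that replaces one node at a time might look like the following sketch (names and location are placeholders):
# Upgrade one surge node at a time, keeping all existing nodes available
gcloud container node-pools update default-pool \
    --cluster=my-cluster \
    --location=us-central1 \
    --max-surge-upgrade=1 \
    --max-unavailable-upgrade=0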
How node upgrades work
While a node is being upgraded, GKE stops scheduling new Pods onto it, and attempts to schedule its running Pods onto other nodes. This is similar to other events that re-create the node, such as enabling or disabling a feature on the node pool.
During automatic or manual node upgrades, PodDisruptionBudgets (PDBs) and the Pod termination grace period are respected for a maximum of one hour. If Pods running on the node can't be scheduled onto new nodes after one hour, GKE initiates the upgrade anyway. This behavior applies even if you configure your PDBs to always have all of your replicas available by setting the maxUnavailable field to 0 or 0%, or by setting the minAvailable field to 100% or to the number of replicas. In all of these scenarios, GKE deletes the Pods after one hour so that the node deletion can happen.
If a workload requires more flexibility with graceful termination, use blue-green upgrades, which provide settings for additional soak time to extend PDB checks beyond the one-hour default.
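As a sketch, switching a node pool to blue-green upgrades with a one-hour soak might look like the following (names and duration are placeholders):
gcloud container node-pools update default-pool \
    --cluster=my-cluster \
    --location=us-central1 \
    --enable-blue-green-upgrade \
    --node-pool-soak-duration=3600s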
To learn more about what to expect during node termination in general, see the topic about Pods.
The upgrade is only complete when all nodes have been recreated and the cluster is in the desired state. When a newly-upgraded node registers with the control plane, GKE marks the node as schedulable.
New node instances run the desired Kubernetes version as well as:
For a node pool upgrade to be considered complete, all nodes in the node pool must be recreated. If an upgrade started but then didn't complete and is in a partially upgraded state, the node pool version might not reflect the version of all of the nodes. To learn more, see Some node versions don't match the node pool version after an incomplete node pool upgrade. To determine that the node pool upgrade finished, check the node pool upgrade status. If the upgrade operation is beyond the retention period, then check that each individual node version matches the node pool version.
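As a quick check, you can compare the node pool's recorded version with the kubelet version on each of its nodes (names are placeholders; the cloud.google.com/gke-nodepool label is set by GKE on every node):
# Version recorded on the node pool
gcloud container node-pools describe default-pool \
    --cluster=my-cluster \
    --location=us-central1 \
    --format="value(version)"
# Version actually running on each node in the pool
kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool \
    -o custom-columns='NAME:.metadata.name,VERSION:.status.nodeInfo.kubeletVersion'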
Manually upgrade a node pool
You can manually upgrade a node pool version to match the version of the control plane, or to a previous version that is still available and compatible with the control plane. You can manually upgrade multiple node pools in parallel, whereas GKE automatically upgrades only one node pool at a time.
When you manually upgrade a node pool, GKE removes any labels that you added to individual nodes using kubectl. To avoid this, apply labels to node pools instead.
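For example, the following sketch applies a label at the node pool level so that it survives upgrades (the names and the env=prod label are placeholders):
gcloud container node-pools update default-pool \
    --cluster=my-cluster \
    --location=us-central1 \
    --node-labels=env=prod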
Before you manually upgrade your node pool, consider the following condition: if your cluster uses an Ingress, the instance group that backs it might not be synced after the upgrade. You can check the Ingress with the kubectl get ing command. If the instance group is not synced, you can work around the problem by re-applying the manifest used to create the Ingress.
You can manually upgrade your node pools to a version compatible with the control plane, using the Google Cloud console or the Google Cloud CLI.
gcloud
The following variables are used in the commands in this section:
CLUSTER_NAME: the name of the cluster of the node pool to be upgraded.
NODE_POOL_NAME: the name of the node pool to be upgraded.
CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
VERSION: the Kubernetes version to which the nodes are upgraded. For example, --cluster-version=1.7.2 or --cluster-version=latest.
Upgrade a node pool:
gcloud container clusters upgrade CLUSTER_NAME \
--node-pool=NODE_POOL_NAME \
--location=CONTROL_PLANE_LOCATION
To specify a different version of GKE on nodes, use the optional --cluster-version flag:
gcloud container clusters upgrade CLUSTER_NAME \
--node-pool=NODE_POOL_NAME \
--location=CONTROL_PLANE_LOCATION \
--cluster-version VERSION
For more information about specifying versions, see Versioning.
For more information, refer to the gcloud container clusters upgrade documentation.
Console
To upgrade a node pool using the Google Cloud console, perform the following steps:
Go to the Google Kubernetes Engine page in Google Cloud console.
Click the name of the cluster.
On the Cluster details page, click the Nodes tab.
In the Node Pools section, click the name of the node pool that you want to upgrade.
Click edit Edit.
Click Change under Node version.
Select the desired version from the Node version drop-down list, then click Change.
It might take several minutes for the node version to change.
Downgrading node pools
You can downgrade a node pool, for example, to mitigate an unsuccessful node pool upgrade. Review the limitations before downgrading a node pool.
Best practice: Use the blue-green node upgrade strategy if you need to optimize for risk mitigation when node pool upgrades impact your workloads. With this strategy, you can roll back an in-progress upgrade to the original nodes if the upgrade is unsuccessful.
To learn more about changing surge upgrade parameters, see Configure surge upgrades.
Checking node pool upgrade status
You can check the status of an upgrade using gcloud container operations.
View a list of every running and completed operation in the cluster, covering the last 12 days if there are fewer than 5,000 operations, or the last 5,000 operations otherwise:
gcloud container operations list \
--location=CONTROL_PLANE_LOCATION
Each operation is assigned an operation ID and an operation type as well as start and end times, target cluster, and status. The list appears similar to the following example:
NAME TYPE ZONE TARGET STATUS_MESSAGE STATUS START_TIME END_TIME
operation-1505407677851-8039e369 CREATE_CLUSTER us-west1-a my-cluster DONE 20xx-xx-xxT16:47:57.851933021Z 20xx-xx-xxT16:50:52.898305883Z
operation-1505500805136-e7c64af4 UPGRADE_CLUSTER us-west1-a my-cluster DONE 20xx-xx-xxT18:40:05.136739989Z 20xx-xx-xxT18:41:09.321483832Z
operation-1505500913918-5802c989 DELETE_CLUSTER us-west1-a my-cluster DONE 20xx-xx-xxT18:41:53.918825764Z 20xx-xx-xxT18:43:48.639506814Z
To get more information about a specific operation, specify the operation ID as shown in the following command:
gcloud container operations describe OPERATION_ID \
--location=CONTROL_PLANE_LOCATION
For example:
gcloud container operations describe operation-1507325726639-981f0ed6
endTime: '20xx-xx-xxT21:40:05.324124385Z'
name: operation-1507325726639-981f0ed6
operationType: UPGRADE_CLUSTER
selfLink: https://container.googleapis.com/v1/projects/.../zones/us-central1-a/operations/operation-1507325726639-981f0ed6
startTime: '20xx-xx-xxT21:35:26.639453776Z'
status: DONE
targetLink: https://container.googleapis.com/v1/projects/.../zones/us-central1-a/clusters/...
zone: us-central1-a
If the upgrade was canceled or failed and is partially complete, you can resume or roll back the upgrade.
Checking node pool upgrade settings
You can see details of the node upgrade strategy being used for your node pools with the gcloud container node-pools describe command. For blue-green upgrades, the command also returns the current phase of the upgrade.
Run the following command:
gcloud container node-pools describe NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--location=CONTROL_PLANE_LOCATION
Replace the following:
NODE_POOL_NAME: the name of the node pool to describe.
CLUSTER_NAME: the name of the cluster of the node pool to describe.
CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
This command outputs the current upgrade settings. The following example shows the output if you are using the blue-green upgrade strategy:
upgradeSettings:
blueGreenSettings:
nodePoolSoakDuration: 1800s
standardRolloutPolicy:
batchNodeCount: 1
batchSoakDuration: 10s
strategy: BLUE_GREEN
If you are using the blue-green upgrade strategy, the output also includes details about the blue-green upgrade settings and the upgrade's current intermediate phase. The following example shows what this might look like:
updateInfo:
blueGreenInfo:
blueInstanceGroupUrls:
- https://www.googleapis.com/compute/v1/projects/{PROJECT_ID}/zones/{LOCATION}/instanceGroupManagers/{BLUE_INSTANCE_GROUP_NAME}
bluePoolDeletionStartTime: {BLUE_POOL_DELETION_TIME}
greenInstanceGroupUrls:
- https://www.googleapis.com/compute/v1/projects/{PROJECT_ID}/zones/{LOCATION}/instanceGroupManagers/{GREEN_INSTANCE_GROUP_NAME}
greenPoolVersion: {GREEN_POOL_VERSION}
phase: DRAINING_BLUE_POOL
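To narrow the describe output to just these fields, you can filter with --format (names and location are placeholders):
gcloud container node-pools describe default-pool \
    --cluster=my-cluster \
    --location=us-central1 \
    --format="yaml(upgradeSettings, updateInfo)"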
Canceling a node pool upgrade
You can cancel an upgrade at any time. To learn more about what happens when you cancel a surge upgrade, see Cancel a surge upgrade. To learn more about what happens when you cancel a blue-green upgrade, see Cancel a blue-green upgrade.
Get the upgrade's operation ID:
gcloud container operations list \
--location=CONTROL_PLANE_LOCATION
Cancel the upgrade:
gcloud container operations cancel OPERATION_ID \
--location=CONTROL_PLANE_LOCATION
Refer to the gcloud container operations cancel documentation.
Resuming a node pool upgrade
You can resume a canceled, failed, or paused upgrade by manually initiating the same upgrade again on the node pool, specifying the target version from the original upgrade operation.
To learn more about what happens when you resume an upgrade, see Resume a surge upgrade and blue-green upgrade.
To resume an upgrade, use the following command:
gcloud container clusters upgrade CLUSTER_NAME \
--node-pool=NODE_POOL_NAME \
--location=CONTROL_PLANE_LOCATION \
--cluster-version VERSION
Replace the following:
NODE_POOL_NAME: the name of the node pool for which you want to resume the node pool upgrade.
CLUSTER_NAME: the name of the cluster of the node pool for which you want to resume the upgrade.
CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
VERSION: the target version of the canceled node pool upgrade.
For more information, refer to the gcloud container clusters upgrade documentation.
Rolling back a node pool upgrade
You can roll back a node pool to downgrade the upgraded nodes to their original state from before the node pool upgrade started.
Use the rollback command if an in-progress upgrade was canceled, the upgrade failed, or the upgrade is incomplete because a maintenance window timed out. Alternatively, if you want to specify the version, follow the instructions to downgrade the node pool.
To learn more about what happens when you roll back a node pool upgrade, see Roll back a surge upgrade or Roll back a blue-green upgrade.
To roll back an upgrade, run the following command:
gcloud container node-pools rollback NODE_POOL_NAME \
--cluster CLUSTER_NAME \
--location=CONTROL_PLANE_LOCATION
Replace the following:
NODE_POOL_NAME: the name of the node pool for which to roll back the node pool upgrade.
CLUSTER_NAME: the name of the cluster of the node pool for which to roll back the upgrade.
CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
Refer to the gcloud container node-pools rollback documentation.
Completing a node pool upgrade
Note: The complete-upgrade command is only possible with blue-green upgrades.
If you are using the blue-green upgrade strategy, you can complete a node pool upgrade during the Soak phase, skipping the rest of the soak time.
To learn how completing a node pool upgrade works, see Complete a node pool upgrade.
To complete an upgrade when using the blue-green upgrade strategy, run the following command:
gcloud container node-pools complete-upgrade NODE_POOL_NAME \
--cluster CLUSTER_NAME \
--location=CONTROL_PLANE_LOCATION
Replace the following:
NODE_POOL_NAME: the name of the node pool for which you want to complete the upgrade.
CLUSTER_NAME: the name of the cluster of the node pool for which you want to complete the upgrade.
CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
Refer to the gcloud container node-pools complete-upgrade documentation.
If you have PodDisruptionBudget objects configured that are unable to allow any additional disruptions, node upgrades might fail to upgrade to the control plane version after repeated attempts. To prevent this failure, we recommend that you scale up the Deployment or HorizontalPodAutoscaler to allow the node to drain while still respecting the PodDisruptionBudget configuration.
To see all PodDisruptionBudget objects that do not allow any disruptions:
kubectl get poddisruptionbudget --all-namespaces -o jsonpath='{range .items[?(@.status.disruptionsAllowed==0)]}{.metadata.name}/{.metadata.namespace}{"\n"}{end}'
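If that command lists a budget that is blocking the drain, one option sketched here is to scale the owning workload up so that the PDB can tolerate a disruption (my-app and my-namespace are hypothetical names):
# Add a replica so the PDB leaves headroom for the node to drain
kubectl scale deployment my-app \
    --namespace=my-namespace \
    --replicas=3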
Although automatic upgrades might encounter the issue, the automatic upgrade process forces the nodes to upgrade. However, the upgrade takes an extra hour for every node running Pods in the istio-system namespace that violate the PodDisruptionBudget.
For information about troubleshooting, see Troubleshoot cluster upgrades.