On this page, you learn how to create a Google Kubernetes Engine (GKE) cluster with node pools running Microsoft Windows Server. With this cluster, you can use Windows Server containers. Microsoft Hyper-V containers are not currently supported. Like Linux containers, Windows Server containers provide process and namespace isolation.
A Windows Server node requires more resources than a typical Linux node because of the overhead of running the Windows OS and the Windows Server components that cannot run in containers. Because Windows Server nodes require more resources, your allocatable resources are lower than they would be with Linux nodes.
Creating a cluster using Windows Server node pools

In this section, you create a cluster that uses Windows Server containers.
To create this cluster, you need to complete the following tasks:

- Create the cluster using gcloud.
- Get kubectl credentials.

The gcloud commands for these tasks assume that you are using a Linux machine.

Set up IAM service accounts for GKE
GKE uses IAM service accounts that are attached to your nodes to run system tasks like logging and monitoring. At a minimum, these node service accounts must have the Kubernetes Engine Default Node Service Account (roles/container.defaultNodeServiceAccount
) role on your project. By default, GKE uses the Compute Engine default service account, which is automatically created in your project, as the node service account.
To grant the roles/container.defaultNodeServiceAccount role to the Compute Engine default service account, complete the following steps:

1. Get your Google Cloud project number:

   gcloud projects describe PROJECT_ID \
   --format="value(projectNumber)"

   Replace PROJECT_ID with your project ID.

   The output is similar to the following:

   12345678901

2. Grant the roles/container.defaultNodeServiceAccount role to the Compute Engine default service account:

   gcloud projects add-iam-policy-binding PROJECT_ID \
   --member="serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
   --role="roles/container.defaultNodeServiceAccount"

   Replace PROJECT_NUMBER with the project number from the previous step.
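If you want to verify that the binding was applied, one way is to filter the project's IAM policy for the role. This is an optional check, and PROJECT_ID is a placeholder for your project ID:

```
gcloud projects get-iam-policy PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.role:roles/container.defaultNodeServiceAccount" \
--format="value(bindings.members)"
```

The output should include serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com.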
Choose your Windows node image

To run on GKE, Windows Server container node images need to be built on Windows Server version 2019 (LTSC), Windows Server version 20H2 (SAC), or Windows Server version 2022 (LTSC). A single cluster can have multiple Windows Server node pools using different Windows Server versions, but each individual node pool can only use one Windows Server version.
Consider the following when choosing your node image:

- Support end date: You can find the support end date for GKE Windows node images by using the gcloud container get-server-config command as described in the Mapping GKE and Windows versions section.
- Container runtime: For both the Windows Server LTSC and SAC node images, the container runtime can be Docker or containerd. For GKE node version 1.21.1-gke.2200 and later, we recommend using the containerd runtime. For more information, see Node images.
Warning: In GKE version 1.24 and later, Docker-based node image types are not supported. In GKE version 1.23, you also cannot create new node pools with Docker node image types. You must migrate to a containerd node image type. To learn more about this change, see About the Docker node image deprecation.

gcloud
Before you start, make sure that you have performed the following tasks:

- Ensure that the gcloud CLI is installed and up to date by running gcloud components update.

Note: For existing gcloud CLI installations, make sure to set the compute/region property. If you primarily use zonal clusters, set compute/zone instead. By setting a default location, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location. You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.

To run Windows Server containers, your cluster must have at least one Windows and one Linux node pool. You cannot create a cluster using only a Windows Server node pool. The Linux node pool is required to run critical cluster add-ons.
Because of its importance, we recommend turning on autoscaling to ensure your Linux node pool has sufficient capacity to run cluster add-ons.
Note: Clusters using Windows Server node pools do not support all Kubernetes and GKE features. See the limitations section for more information.

gcloud

Create a cluster with the following fields:
gcloud container clusters create CLUSTER_NAME \
--location=CONTROL_PLANE_LOCATION \
--enable-ip-alias \
--num-nodes=NUMBER_OF_NODES \
--cluster-version=VERSION_NUMBER \
--release-channel CHANNEL
Replace the following:
- CLUSTER_NAME: the name you choose for your cluster.
- CONTROL_PLANE_LOCATION: the Compute Engine location of the control plane of your cluster. Provide a region for regional clusters, or a zone for zonal clusters.
- --enable-ip-alias: turns on alias IP. Alias IP is required for Windows Server nodes. To read more about its benefits, see Understanding native container routing with Alias IPs.
- NUMBER_OF_NODES: the number of Linux nodes you create. You should provide sufficient compute resources to run cluster add-ons. This field is optional; if omitted, it defaults to 3.
- VERSION_NUMBER: the specific cluster version you want to use, which must be 1.16.8-gke.9 or later. If you do not specify a release channel, GKE enrolls your cluster in the most mature release channel where that version is available.
- CHANNEL: the release channel to enroll the cluster in, which can be one of rapid, regular, stable, or None. By default, the cluster is enrolled in the regular release channel unless at least one of the following flags is specified: --cluster-version, --release-channel, --no-enable-autoupgrade, or --no-enable-autorepair. You must specify None if you choose a cluster version and do not want your cluster to be enrolled in a release channel.

We strongly recommend that you specify a minimally privileged IAM service account that your nodes can use instead of the Compute Engine default service account. To learn how to create a minimally privileged service account, see Use a least privilege service account.
To specify a custom service account in the gcloud CLI, add the following flag to your command:
--service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
Replace SERVICE_ACCOUNT_NAME with the name of your minimally-privileged service account.
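Putting these pieces together, a complete cluster-creation command might look like the following sketch. All names, the location, and the node count are placeholder values chosen for illustration; substitute your own:

```
gcloud container clusters create demo-windows-cluster \
--location=us-central1 \
--enable-ip-alias \
--num-nodes=2 \
--release-channel=regular \
--service-account=gke-nodes@my-project.iam.gserviceaccount.com
```

Because --release-channel is specified, GKE picks a cluster version from the regular channel for you.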
Create the Windows Server node pool with the following fields:
gcloud container node-pools create NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--location=CONTROL_PLANE_LOCATION \
--image-type=IMAGE_NAME \
--no-enable-autoupgrade \
--machine-type=MACHINE_TYPE_NAME \
--windows-os-version=WINDOWS_OS_VERSION
Replace the following:
- NODE_POOL_NAME: the name you choose for your Windows Server node pool.
- CLUSTER_NAME: the name of the cluster you created above.
- CONTROL_PLANE_LOCATION: the Compute Engine location of the control plane of your cluster. Provide a region for regional clusters, or a zone for zonal clusters.
- IMAGE_NAME: one of the following values:
  - WINDOWS_LTSC_CONTAINERD: Windows Server LTSC with containerd. This is the image type for both the Windows Server 2022 and Windows Server 2019 OS images.
  - WINDOWS_SAC_CONTAINERD: Windows Server SAC with containerd (unsupported after August 9, 2022).
  - WINDOWS_LTSC: Windows Server LTSC with Docker.
  - WINDOWS_SAC: Windows Server SAC with Docker (unsupported after August 9, 2022).
  For more information about these node images, see the Choose your Windows node image section.
- --no-enable-autoupgrade: disables node auto-upgrade. Review Upgrading Windows Server node pools before enabling auto-upgrade.
- MACHINE_TYPE_NAME: defines the machine type. n1-standard-2 is the minimum recommended machine type because Windows Server nodes require additional resources. The f1-micro and g1-small machine types are not supported. Each machine type is billed differently. For more information, refer to the machine type price sheet.
- WINDOWS_OS_VERSION: defines the Windows OS version to use for the WINDOWS_LTSC_CONTAINERD image type. This flag is optional; when not specified, the default OS version is LTSC2019. Set the value to ltsc2022 to create a Windows Server 2022 node pool, or ltsc2019 to create a Windows Server 2019 node pool.
The default LTSC image that GKE uses to create WINDOWS_LTSC_CONTAINERD node pools through the Google Cloud console or CLI is ltsc2019. Creation of Windows Server 2022 node images is only supported for GKE version 1.25.3-gke.800 or later. If you want your clusters to use Windows Server 2022 node pools, make sure to upgrade your clusters to a supported GKE version.
The following example shows how you can create a Windows Server 2022 node pool:
gcloud container node-pools create node_pool_name \
--cluster=cluster_name \
--location=us-central1 \
--image-type=WINDOWS_LTSC_CONTAINERD \
--windows-os-version=ltsc2022
The following example shows how you can update an existing Windows node pool to use the Windows Server 2022 OS image:

gcloud container node-pools update node_pool_name \
--cluster=cluster_name \
--location=us-central1 \
--windows-os-version=ltsc2022
Console
From the navigation pane, under Node Pools, click Nodes.
From the Image type drop-down list, select a Windows Server node image. For more information about the available images, see the Choose your Windows node image section.
Choose the default Machine configuration to use for the instances. n1-standard-2
is the minimum recommended size as Windows Server nodes require additional resources. Machine types f1-micro
and g1-small
are not supported. Each machine type is billed differently. For more information, refer to the machine type price sheet.
From the navigation pane, select the name of your Windows Server node pool. This returns you to the Node pool details page.
From the navigation pane, under Cluster, select Networking.
Click Create.
To create a GKE Standard cluster and a Windows Server node pool using Terraform, refer to the following example:
This example uses Windows Server LTSC with containerd. This is the image type for both Windows Server 2022 and Windows Server 2019 OS image. For more information about node images, see Choose your Windows node image.
To learn more about using Terraform, see Terraform support for GKE.
After you create a Windows Server node pool, the cluster goes into a RECONCILE
state for several minutes as the control plane is updated.
Use the get-credentials
command to enable kubectl
to work with the cluster you created.
gcloud container clusters get-credentials CLUSTER_NAME \
--location CONTROL_PLANE_LOCATION
For more information on the get-credentials
command, see the SDK get-credentials documentation.
Before using the cluster, wait for several seconds until windows.config.common-webhooks.networking.gke.io
is created. This webhook adds scheduling tolerations to Pods created with the kubernetes.io/os: windows
node selector to ensure they are allowed to run on Windows Server nodes. It also validates the Pod to ensure that it only uses features supported on Windows.
To ensure the webhook is created, run the following command:
kubectl get mutatingwebhookconfigurations
The output should show the webhook running:
NAME CREATED AT
windows.config.common-webhooks.networking.gke.io 2019-12-12T16:55:47Z
Now that you have a cluster with two node pools (one Linux and one Windows), you can deploy a Windows application.
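For example, a minimal Deployment manifest for a Windows workload only needs the kubernetes.io/os: windows node selector; the webhook described above adds the required tolerations. This is a sketch, and the container image tag is illustrative; it must match the Windows Server version of your node pool:

```shell
# Write a minimal Windows Deployment manifest. The image tag below is an
# example; use a tag that matches your node pool's Windows Server version.
cat > win-webserver.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: win-webserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: win-webserver
  template:
    metadata:
      labels:
        app: win-webserver
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      containers:
      - name: webserver
        image: mcr.microsoft.com/windows/servercore:ltsc2019
EOF
# Deploy it with: kubectl apply -f win-webserver.yaml
```

The node selector ensures the Pod is scheduled only onto Windows Server nodes, never onto the Linux node pool that runs cluster add-ons.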
Mapping GKE and Windows versions

Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
Microsoft releases new SAC versions approximately every six months and new LTSC versions every two to three years. These new versions are typically available in new GKE minor versions. Within a GKE minor version the LTSC and SAC versions usually remain fixed.
To see the version mapping between GKE versions and Windows Server versions, use the gcloud beta container get-server-config
command:
gcloud beta container get-server-config
The version mapping is returned in the windowsVersionMaps
field of the response. To filter the response to see the version mapping for specific GKE versions in your cluster, perform the following steps in a Linux shell or in Cloud Shell.
Set the following variables:
CLUSTER_NAME=CLUSTER_NAME \
NODE_POOL_NAME=NODE_POOL_NAME \
CONTROL_PLANE_LOCATION=CONTROL_PLANE_LOCATION
Replace the following:
- CLUSTER_NAME: the name of your cluster.
- NODE_POOL_NAME: the name of the Windows Server node pool.
- CONTROL_PLANE_LOCATION: the Compute Engine location of the control plane of your cluster. Provide a region for regional clusters, or a zone for zonal clusters.

Obtain the node pool version and store it in the NODE_POOL_VERSION variable:
NODE_POOL_VERSION=`gcloud container node-pools describe $NODE_POOL_NAME \
--cluster=$CLUSTER_NAME \
--location=$CONTROL_PLANE_LOCATION \
--format="value(version)"`
Obtain the Windows Server versions for NODE_POOL_VERSION:
gcloud beta container get-server-config \
--location=$CONTROL_PLANE_LOCATION \
--format="yaml(windowsVersionMaps.\"$NODE_POOL_VERSION\")"
The output is similar to the following:
windowsVersionMaps:
1.18.6-gke.6601:
windowsVersions:
- imageType: WINDOWS_SAC
osVersion: 10.0.18363.1198
supportEndDate:
day: 10
month: 5
year: 2022
- imageType: WINDOWS_LTSC
osVersion: 10.0.17763.1577
supportEndDate:
day: 9
month: 1
year: 2024
Obtain the Windows Server version for the WINDOWS_SAC
image type:
gcloud beta container get-server-config \
--flatten=windowsVersionMaps.\"$NODE_POOL_VERSION\".windowsVersions \
--filter="windowsVersionMaps.\"$NODE_POOL_VERSION\".windowsVersions.imageType=WINDOWS_SAC" \
--format="value(windowsVersionMaps.\"$NODE_POOL_VERSION\".windowsVersions.osVersion)"
The output is similar to the following:
10.0.18363.1198
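The third dot-separated field of the osVersion value is the Windows build number, which identifies the Windows Server release your container base images must match (for example, build 18363 corresponds to SAC version 1909). A quick shell sketch to extract it:

```shell
# Extract the Windows build number (third dot-separated field) from an
# osVersion value; the example value is copied from the output above.
OS_VERSION="10.0.18363.1198"
BUILD=$(echo "$OS_VERSION" | cut -d. -f3)
echo "$BUILD"   # prints 18363
```

You can compare this build number against the base image tags published for Windows Server to pick a compatible container base image.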
The Windows Server container version compatibility requirements mean that your container images might need to be rebuilt to match the Windows Server version for a new GKE version before upgrading your node pools.
To ensure that your container images remain compatible with your nodes, we recommend that you check the version mapping and build your Windows Server container images as multi-arch images that can target multiple Windows Server versions. You can then update your container deployments to target the multi-arch images that will work on both the current and the next GKE version before manually invoking a GKE node pool upgrade. Manual node pool upgrades must be performed regularly because nodes cannot be more than two minor versions behind the control plane version.
We recommend that you subscribe to upgrade notifications using Pub/Sub to proactively receive updates about new GKE versions and the Windows OS versions they use.
We recommend enabling node auto-upgrades only if you continuously build multi-arch Windows Server container images that target the latest Windows Server versions, especially if you are using Windows Server SAC as the node image type. Node auto-upgrades are less likely to cause problems with the Windows Server LTSC node image type but there is still a risk of encountering version incompatibility issues.
Windows Updates

Windows Updates are disabled for Windows Server nodes. Automatic updates can cause node restarts at unpredictable times, and any Windows Updates installed after a node starts would be lost when the node is recreated by GKE. GKE makes Windows Updates available by periodically updating the Windows Server node images used in new GKE releases. There can be a delay between when Windows Updates are released by Microsoft and when they are available in GKE. When critical security updates are released, GKE updates the Windows Server node images as quickly as possible.
Control how Windows Pods and Services communicate

You can control how Windows Pods and Services communicate using network policies.
You can run Windows Server containers on clusters that have network policy enabled in GKE versions 1.22.2 and later. This feature is available for clusters that use the WINDOWS_LTSC or WINDOWS_LTSC_CONTAINERD node image types.
If your control planes or nodes are running earlier versions, you can migrate your node pools to a version that supports network policy by upgrading your node pools and your control plane to GKE version 1.22.2 or later. This option is only available if you created your cluster with the --enable-dataplane-v2
flag.
After you enable network policy, all previously configured policies, including policies that did not work on Windows Server containers before you enabled the feature, become active.
Some features are not supported when you use Windows Server containers on clusters with network policy enabled. See the limitations section for more details.
Viewing and querying logs

Logging is enabled automatically in GKE clusters. You can view the logs of the containers and the logs from other services on the Windows Server nodes using Kubernetes Engine monitoring.
The following is an example of a filter to get the container log:
resource.type="k8s_container"
resource.labels.cluster_name="your_cluster_name"
resource.labels.namespace_name="your_namespace_id"
resource.labels.container_name="your_container_name"
resource.labels.pod_name="your_pod_name"
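You can also run the same kind of filter from the command line with gcloud logging read. This is a sketch; the cluster and container names are placeholders:

```
gcloud logging read '
resource.type="k8s_container"
resource.labels.cluster_name="your_cluster_name"
resource.labels.container_name="your_container_name"' \
--limit=10
```

Expressions on separate lines of a Cloud Logging filter are implicitly combined with AND.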
Accessing a Windows Server node using Remote Desktop Protocol (RDP)
You can connect to a Windows Server node in your cluster using RDP. For instructions on how to connect, see Connecting to Windows instances in the Compute Engine documentation.
Building multi-arch images

You can build the multi-arch images manually or use a Cloud Build builder. For instructions, see Building Windows multi-arch images.
Using gMSA

The following steps show you how to use a Group Managed Service Account (gMSA) with your Windows Server node pools.
Configure Windows Server nodes in your cluster to automatically join your AD domain. For instructions, see Configure Windows Server nodes to automatically join an Active Directory domain.
Create and grant a gMSA access to the security group automatically created by the domain join service. This step must be done on a machine with administrative access to your AD domain.
$instanceGroupUri = gcloud container node-pools describe NODE_POOL_NAME --cluster CLUSTER_NAME --format="value(instanceGroupUrls)"
$securityGroupName = ([System.Uri]$instanceGroupUri).Segments[-1]
$securityGroup = dsquery group -name $securityGroupName
$gmsaName = GMSA_NAME
$dnsHostName = DNS_HOST_NAME
New-ADServiceAccount -Name $gmsaName -DNSHostName $dnsHostName -PrincipalsAllowedToRetrieveManagedPassword $securityGroup
Get-ADServiceAccount $gmsaName
Test-ADServiceAccount $gmsaName
Replace the following:
- NODE_POOL_NAME: the name of your Windows Server node pool. The automatically created security group has the same name as your Windows Server node pool.
- CLUSTER_NAME: the name of your cluster.
- GMSA_NAME: the name you choose for the new gMSA.
- DNS_HOST_NAME: the fully qualified domain name (FQDN) of the service account you created. For example, if GMSA_NAME is webapp01 and the domain is example.com, then DNS_HOST_NAME is webapp01.example.com.

Configure your gMSA by following the instructions in the Configure GMSA for Windows Pods and containers tutorial.
Deleting a Windows Server node pool

Delete a Windows Server node pool by using gcloud or the Google Cloud console.

gcloud

gcloud container node-pools delete NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--location=CONTROL_PLANE_LOCATION
Console
To delete a Windows Server node pool using the Google Cloud console, perform the following steps:
Go to the Google Kubernetes Engine page in the Google Cloud console.
Beside the cluster you want to edit, click more_vert Actions, then click edit Edit.
Select the Nodes tab.
Under the Node Pools section, click delete Delete next to the node pool you want to delete.
When prompted to confirm, click Delete again.
Limitations

There are some Kubernetes features that are not yet supported for Windows Server containers. In addition, some features are Linux-specific and do not work for Windows. For the complete list of supported and unsupported Kubernetes features, see the Kubernetes documentation.
In addition to the unsupported Kubernetes features, there are some GKE features that are not supported.
For GKE clusters, the following features are not supported with Windows Server node pools:
- Cloud TPUs (--enable-tpu)
- Intranode visibility (--enable-intra-node-visibility)
- Alpha clusters (--enable-kubernetes-alpha)
- Client IP session affinity (service.spec.sessionAffinity)
- GPU accelerators (--accelerator)
- Local external traffic policy on Windows node pools is only supported with GKE version 1.23.4-gke.400 or later.
Other Google Cloud products that you want to use with GKE clusters might not support Windows Server node pools. For specific limitations, refer to the documentation of that product.
Troubleshooting

See the Kubernetes documentation for general guidance on debugging Pods and Services.
Containerd node issues

For known issues using a containerd node image, see Known issues.
Windows Pods fail to start

A version mismatch between the Windows Server container and the Windows node that is trying to run the container can result in your Windows Pods failing to start.
If the version for your Windows node pool is 1.16.8-gke.8 or later, review Microsoft's documentation for the February 2020 Windows Server container incompatibility issue and build your container images with base Windows images that include Windows Updates from March 2020. Container images built on earlier base Windows images might fail to run on these Windows nodes and can also cause the node to fail with status NotReady.
Image pulls time out or fail

Windows Server container images, and the individual layers they are composed of, can be quite large. Their size can cause the kubelet to time out and fail when downloading and extracting the container layers.
You might have encountered this problem if you see the "Failed to pull image" or "Image pull context cancelled" error messages or an ErrImagePull
status for your Pods.
If image pull failures occur frequently, use node pools with a higher CPU specification. Container extraction is executed in parallel across cores, so machine types with more cores reduce the overall pull time.
Try the following options to successfully pull your Windows Server containers:
Break the application layers of the Windows Server container image into smaller layers that can each be pulled and extracted more quickly. This can make Docker's layer caching more effective and make image pull retries more likely to succeed. To learn more about layers, see the Docker article Storage drivers.
Connect to your Windows Server nodes and manually use the docker pull
command on your container images before creating your Pods.
Set the image-pull-progress-deadline
flag for the kubelet
service to increase the timeout for pulling container images.
Set the flag by connecting to your Windows nodes and running the following PowerShell commands.
Note: This procedure requires a kubelet restart, which can disrupt Pods running on the node.
Get the existing command line for the Kubelet service from the Windows registry.
PS C:\> $regkey = "HKLM\SYSTEM\CurrentControlSet\Services\kubelet"
PS C:\> $name = "ImagePath"
PS C:\> $(reg query ${regkey} /v ${name} | Out-String) -match "(?s)${name}.*(C:.*kubelet\.exe.*)"
PS C:\> $kubelet_cmd = $Matches[1] -replace "--image-pull-progress-deadline=.* ","" -replace "\r\n"," "
Set a new command line for the Kubelet service, with an additional flag to increase the timeout.
PS C:\> reg add ${regkey} /f /v ${name} /t REG_EXPAND_SZ /d "${kubelet_cmd} --image-pull-progress-deadline=40m "
Confirm that the change was successful.
PS C:\> reg query ${regkey} /v ${name}
Restart the kubelet
service so the new flag takes effect.
PS C:\> Restart-Service kubelet
Confirm that the kubelet
service restarted successfully.
PS C:\> Get-Service kubelet # ensure state is Running
Windows node image has reached end of life

When creating a node pool with a Windows image, you receive an error similar to the following:
WINDOWS_SAC image family for 1.18.20-gke.501 has reached end of life, newer versions are still available.
To resolve this error, choose a Windows image that is available and supported. You can find the support end date for GKE Windows node images by using the gcloud container get-server-config
command as described in the Mapping GKE and Windows versions section.
Node pool creation times out

Node pool creation can time out if you are creating a large number of nodes (for example, 500) and it's the first node pool in the cluster using a Windows Server image.
To resolve this issue, reduce the number of nodes you are creating. You can increase the number of nodes later.
Windows nodes become NotReady with the error "PLEG is not healthy"
This is a known Kubernetes issue that happens when multiple Pods are started very rapidly on a single Windows node. To recover from this situation, restart the Windows Server node. A recommended workaround to avoid this issue is to limit the rate at which Windows Pods are created to one Pod every 30 seconds.
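One way to apply the 30-second rate limit is to create Pods sequentially from a script rather than all at once. This is a sketch, assuming one manifest file per Pod with hypothetical names:

```
# Apply Windows Pod manifests one at a time, pausing 30 seconds between
# each to avoid overwhelming the node's Pod lifecycle event generator.
for manifest in win-pod-1.yaml win-pod-2.yaml win-pod-3.yaml; do
  kubectl apply -f "$manifest"
  sleep 30
done
```

For workloads managed by a controller, you can achieve a similar effect by scaling up replicas gradually instead of in one step.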
Inconsistent TerminationGracePeriod

The Windows system timeout for the container might differ from the grace period you configure. This difference can cause Windows to force-terminate the container before the end of the grace period passed to the runtime.
You can modify the Windows timeout by editing container-local registry keys at image-build time. If you modify the Windows timeout, you might also need to adjust TerminationGracePeriodSeconds to match.
Network connectivity problems

If you experience network connectivity problems from your Windows Server containers, it might be because Windows Server container networking often assumes a network MTU of 1500, which is incompatible with Google Cloud's MTU of 1460.
Check that both the MTU of the network interface in the container and the network interfaces of the Windows Server node itself are set to the same value (that is, 1460
or less). For information on how to set the MTU, see known issues for Windows containers.
Nodes fail to start or join the cluster

If nodes fail to start in the cluster or fail to join the cluster successfully, review the diagnostic information provided in the node's serial port output.
Run the following command to see the serial port output:
gcloud compute instances get-serial-port-output NODE_NAME --zone=COMPUTE_ZONE
Replace the following:
- NODE_NAME: the name of the node.
- COMPUTE_ZONE: the compute zone for the specific node.

Delays in processing load balancer rules

When starting Windows nodes in Kubernetes clusters with a high number of Host Network Service (HNS) load balancer rules, there is a delay in processing the rules. Services are intermittently unreachable during the delay, which lasts around 30 seconds per rule, so the total delay can be significant if there are enough rules. To learn more, see the original issue in GitHub.
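To get a feel for the scale of this delay, at roughly 30 seconds per rule the startup cost grows linearly with the rule count. The rule count below is a made-up example:

```shell
# Rough estimate of total rule-processing delay at ~30 seconds per rule.
RULES=40            # example rule count
DELAY_PER_RULE=30   # seconds, per the behavior described above
TOTAL=$((RULES * DELAY_PER_RULE))
echo "${TOTAL} seconds"   # prints "1200 seconds", i.e. 20 minutes
```

So even a few dozen rules can keep Services intermittently unreachable for many minutes after a node starts.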
For GKE clusters running version 1.24 or earlier, on any Windows node that had an event that restarted kube-proxy (for example, node startup, node upgrade, or a manual restart), any Service reached from a Pod running on that node is unreachable until kube-proxy syncs all of the rules.
For GKE clusters running version 1.25 or later, this behavior is substantially improved. For details on this improvement, see the pull request in GitHub. If you are experiencing this issue, we recommend upgrading your cluster's control plane to 1.25 or later.