
Showing content from http://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview below:

GKE Autopilot overview | Google Kubernetes Engine (GKE)


This page describes Google Kubernetes Engine (GKE) Autopilot mode, a managed mode of GKE clusters. It provides the information you need to understand the benefits and considerations of Autopilot mode. It covers topics such as planning and creating Autopilot clusters, deploying and managing applications, configuring networking, ensuring security, and scaling workloads.

This page is for Admins, Architects, and Operators who make informed decisions when evaluating how GKE Autopilot mode aligns with the operational requirements of their containerized workloads. To learn more about common roles and example tasks referenced in Google Cloud content, see Common GKE user roles and tasks.

Before reading this page, ensure that you're familiar with Compare GKE Autopilot and Standard.

What is Autopilot?

GKE Autopilot is a mode of operation in GKE in which Google manages your cluster configuration, including your nodes, scaling, security, and other preconfigured settings. Autopilot clusters are optimized to run most production workloads, and provision compute resources based on your Kubernetes manifests. The streamlined configuration follows GKE best practices and recommendations for cluster and workload setup, scalability, and security. For a list of built-in settings, refer to the Autopilot and Standard comparison table.

Best practice:

Use Autopilot for a fully managed Kubernetes experience.

Pricing

In most situations, you only pay for the CPU, memory, and storage that your workloads request while running on GKE Autopilot. You aren't billed for unused capacity on your nodes, because GKE manages the nodes. Note that exceptions to this pricing model exist when you run Pods on specific compute classes that let Pods use the full resource capacity of the node virtual machine (VM).

You aren't charged for system Pods, operating system costs, or unscheduled workloads. For detailed pricing information, refer to Autopilot pricing.
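Because billing is driven by what your containers request, the resource requests in your manifest effectively set your cost. The following minimal sketch shows where those requests live; the Pod name and image are placeholders, not values from this page:

```yaml
# Hypothetical Pod: on the general-purpose Autopilot platform, you pay
# for the CPU, memory, and ephemeral storage requested here, not for
# unused node capacity.
apiVersion: v1
kind: Pod
metadata:
  name: billing-example            # illustrative name
spec:
  containers:
  - name: app
    image: us-docker.pkg.dev/my-project/my-repo/app:latest  # placeholder image
    resources:
      requests:
        cpu: "500m"                # half a vCPU
        memory: "1Gi"
        ephemeral-storage: "1Gi"
```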

Benefits

Autopilot comes with an SLA that covers both the control plane and the compute capacity used by your Pods.

About the Autopilot container-optimized compute platform

In GKE version 1.32.3-gke.1927002 and later, Autopilot includes a specialized container-optimized compute platform for your workloads. This platform works well for most general-purpose workloads that don't require specific hardware, such as web servers and medium-intensity batch jobs.

The container-optimized compute platform uses GKE Autopilot nodes that can dynamically resize while running, designed to scale up from fractions of a CPU with minimal disruptions. This dynamic resizing significantly reduces the time that's needed to provision new capacity as your workloads scale. To improve the speed of scaling and resizing, GKE also maintains a pool of pre-provisioned compute capacity that can be automatically allocated for workloads in response to increased resource demands.

The container-optimized compute platform gives you, as an Autopilot user, faster provisioning and rapid, low-disruption scaling of compute for your workloads.

In Autopilot clusters, Pods that don't select specific hardware use this compute platform by default.

Plan your Autopilot clusters

Before you create a cluster, plan and design your Google Cloud architecture. In Autopilot, you request hardware in your workload specifications. GKE provisions and manages the corresponding infrastructure to run those workloads. For example, if you run machine learning workloads, you request hardware accelerators. If you develop Android apps, you request Arm CPUs.

Plan and request quota for your Google Cloud project or organization based on the scale of your workloads. GKE can only provision infrastructure for your workloads if your project has enough quota for that hardware.

Consider factors such as networking, scaling, and security during planning.

The following sections provide information and useful resources for these considerations.

Networking

When you create an Autopilot cluster with public networking, workloads in the cluster can communicate with each other and with the internet. This is the default networking mode. Google Cloud and Kubernetes provide various additional networking features and capabilities that you can leverage based on your use case, such as clusters with private networking.

Networking in Kubernetes and in the cloud is complex. Before you start changing the defaults that Google Cloud sets for you, ensure that you're familiar with the basic concepts of networking. The following table provides you with resources to learn more about networking in GKE based on your use case:

| Use case | Resources |
| --- | --- |
| Understand how networking works in Kubernetes and GKE | |

After you learn the networking model, consider your organization's networking and network security requirements. Choose GKE and Google Cloud networking features that satisfy those criteria.

Plan your GKE networking configuration

We recommend that you understand the networking quotas for GKE, such as endpoints per Service and API request limits. The following resources will help you to plan specific aspects of your networking setup:

| Use case | Resources |
| --- | --- |
| Expose your workloads | |
| Run highly-available connected services in multiple clusters | Use multi-cluster Services (MCS). |
| Load balance incoming traffic | |
| Configure cluster network security | |
| Observe your Kubernetes network traffic | |

By default, Autopilot uses GKE Dataplane V2 for metrics and observability.

Scaling

Operating a platform effectively at scale requires planning and careful consideration. You must consider the scalability of your design, which is the ability of your clusters to grow while remaining within service-level objectives (SLOs). For detailed guidance for both platform administrators and developers, refer to the Guidelines for creating scalable clusters.

You should also consider the GKE quotas and limits, especially if you plan to run large clusters with potentially thousands of Pods.

Scale Autopilot workloads

In Autopilot, GKE automatically scales your nodes based on the number of Pods in your cluster. If a cluster has no running workloads, Autopilot can automatically scale the cluster down to zero nodes. Following cluster scale-down, no nodes remain in the cluster and system Pods are consequently in an unschedulable state. This is expected behavior. In most newly created Autopilot clusters, you might notice that the first workloads that you deploy take more time to schedule. This is because the new Autopilot cluster starts with zero usable nodes upon creation and waits until you deploy a workload to provision additional nodes.

Best practice:

To automatically scale the number of Pods in your cluster, use a mechanism such as Kubernetes horizontal Pod autoscaling, which can scale Pods based on the built-in CPU and memory metrics, or based on custom metrics from Cloud Monitoring. To learn how to configure scaling based on various metrics, refer to Optimize Pod autoscaling based on metrics.
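As a sketch of the CPU-based case, a standard Kubernetes HorizontalPodAutoscaler like the following could drive Pod scaling; the Deployment name and thresholds are illustrative assumptions:

```yaml
# Hypothetical HPA: scales an assumed Deployment named "frontend"
# between 2 and 10 replicas to hold average CPU utilization near 60%.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend                 # assumed existing Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
```

As Autopilot adds Pods through the HPA, GKE provisions the matching node capacity automatically.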

Security

Autopilot clusters enable and apply security best practices and settings by default, including many of the recommendations in Harden your cluster security and the GKE security overview.

If you want to learn more about Autopilot hardening measures and how to implement your specific security requirements, refer to Security measures in Autopilot.

Create a cluster

After planning your environment and understanding your requirements, create an Autopilot cluster. New Autopilot clusters are regional clusters that have a publicly accessible IP address. Each cluster has baseline hardening measures applied, as well as automatic scaling and other features. For a full list of pre-configured features, refer to Compare GKE Autopilot and Standard.
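As a sketch, cluster creation can look like the following gcloud commands; the cluster name and region are placeholders you would replace with your own values:

```shell
# Create a regional Autopilot cluster (illustrative name and region).
gcloud container clusters create-auto example-cluster \
    --location=us-central1

# Fetch credentials so kubectl can talk to the new cluster.
gcloud container clusters get-credentials example-cluster \
    --location=us-central1
```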

If you want to create the cluster with no access to external IP addresses, configure your network isolation.

Deploy workloads on Autopilot

To deploy a workload to a running Autopilot cluster, write a Kubernetes manifest and apply it to the cluster. By default, Autopilot clusters are optimized to run most production workloads.

For an interactive guide to deploying and exposing an app, use the Guide me walkthrough in the Google Cloud console.

Some of your workloads might have specialized hardware requirements, such as ML workloads that need hardware accelerators or mobile app testing that requires the Arm architecture. Autopilot has predefined compute classes that Google Cloud has configured to run workloads that have special compute requirements. If you have more specific hardware requirements, you can define your own custom compute classes. When deploying these specialized workloads, request a compute class in the manifest. Autopilot automatically provisions nodes backed by specialized machines, manages scheduling, and allocates hardware.

The following table shows some common requirements and provides recommendations for what you should do:

| Use case | Resources |
| --- | --- |
| Control individual node properties when scaling a cluster | Deploy a custom compute class and request it in your workload manifest. For details, see About custom compute classes. |
| Run Arm workloads | Request the Scale-Out compute class and the arm64 architecture in your manifest. For instructions, refer to Deploy Autopilot workloads on Arm architecture. |
| Run accelerated AI/ML workloads | Request GPUs in your manifest. For instructions, refer to Deploy GPU workloads in Autopilot. |
| Run workloads that require high compute or memory capacity | Request the Balanced compute class. For instructions, refer to Choose compute classes for Autopilot Pods. |
| Run workloads that require more efficient horizontal scaling of CPU capacity and single thread-per-core compute | Request the Scale-Out compute class. For instructions, refer to Choose compute classes for Autopilot Pods. |
| Run fault-tolerant workloads such as batch jobs at lower costs | Specify Spot Pods in your manifest. For instructions, refer to Run fault-tolerant workloads at lower costs in Spot Pods. You can use any compute class or hardware configuration with Spot Pods. |
| Run workloads that require minimal disruptions, such as game servers or work queues | Specify the cluster-autoscaler.kubernetes.io/safe-to-evict=false annotation in the Pod specification. Pods are protected from eviction caused by node auto-upgrades or scale-down events for up to seven days. For instructions, see Extend the run time of Autopilot Pods. |
| Let workloads burst beyond their requests if there are available, unused resources in the sum of Pod resource requests on the node | Set your resource limits higher than your requests, or don't set resource limits. For instructions, see Configure Pod bursting in GKE. |
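As one example of requesting a compute class, the Arm row above can be sketched as the following manifest; the Pod name, image, and request sizes are illustrative assumptions:

```yaml
# Hypothetical Arm workload: the nodeSelector requests the Scale-Out
# compute class and the arm64 architecture, so Autopilot provisions
# Arm-based nodes for this Pod.
apiVersion: v1
kind: Pod
metadata:
  name: arm-example
spec:
  nodeSelector:
    cloud.google.com/compute-class: Scale-Out
    kubernetes.io/arch: arm64
  containers:
  - name: app
    image: us-docker.pkg.dev/my-project/my-repo/arm-app:latest  # placeholder image
    resources:
      requests:
        cpu: "2"
        memory: "8Gi"
```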

Autopilot lets you request CPU, memory, and ephemeral storage resources for your workloads. The allowed ranges depend on whether you want to run your Pods on the default general-purpose compute platform, or on a compute class.

For information about the default container resource requests and the allowed resource ranges, refer to Resource requests in Autopilot.

Workload separation

Autopilot clusters support using node selectors and node affinity to configure workload separation. Workload separation is useful when you need to tell GKE to place workloads on nodes that meet specific criteria, such as custom node labels. For example, you can tell GKE to schedule game server Pods on nodes with the game-server label and avoid scheduling any other Pods on those nodes.
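The game-server example can be sketched as follows; the label key, value, and image are illustrative, and the toleration lets the Pod schedule onto nodes that GKE taints for that group:

```yaml
# Hypothetical game-server Pod using workload separation: the
# nodeSelector asks for nodes labeled game-server=true, and the
# matching toleration permits scheduling onto nodes tainted for
# that label so other Pods stay off them.
apiVersion: v1
kind: Pod
metadata:
  name: game-server
spec:
  nodeSelector:
    game-server: "true"            # custom node label (illustrative)
  tolerations:
  - key: game-server
    operator: Equal
    value: "true"
    effect: NoSchedule
  containers:
  - name: server
    image: us-docker.pkg.dev/my-project/my-repo/game:latest  # placeholder image
```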

To learn more, refer to Configure workload separation in GKE.

Schedule Pods in specific zones using zonal topology

If you need to place Pods in a specific Google Cloud zone, for example to access information on a zonal Compute Engine persistent disk, see Place GKE Pods in specific zones.
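A minimal sketch of zonal placement, using the well-known topology label with an illustrative zone and image:

```yaml
# Hypothetical Pod pinned to one zone, for example to reach a zonal
# Compute Engine persistent disk in that zone.
apiVersion: v1
kind: Pod
metadata:
  name: zonal-example
spec:
  nodeSelector:
    topology.kubernetes.io/zone: us-central1-a   # illustrative zone
  containers:
  - name: app
    image: us-docker.pkg.dev/my-project/my-repo/app:latest  # placeholder image
```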

Pod affinity and anti-affinity

Use Pod affinity and anti-affinity to co-locate Pods on a single node or to make some Pods avoid other Pods. Pod affinity and anti-affinity tell Kubernetes to make a scheduling decision based on the labels of Pods running on nodes in a specific topology domain, such as a specific region or zone. For example, you could tell GKE to avoid scheduling frontend Pods alongside other frontend Pods on the same nodes to improve availability in case of an outage.

For instructions and more details, refer to Pod affinity and anti-affinity.
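The frontend example above can be sketched as the following Deployment snippet; the names, labels, and image are illustrative assumptions:

```yaml
# Hypothetical frontend Deployment: podAntiAffinity keeps replicas off
# nodes that already run a Pod labeled app=frontend, so an outage on
# one node takes down at most one replica.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: frontend
            topologyKey: kubernetes.io/hostname   # at most one frontend Pod per node
      containers:
      - name: web
        image: nginx:1.27          # placeholder image
```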

In GKE, you can use Pod affinity and anti-affinity with supported labels in topologyKey, such as topology.kubernetes.io/zone and kubernetes.io/hostname.

Pod topology spread constraints

To improve the availability of your workloads as Kubernetes scales the number of Pods up and down, you can set Pod topology spread constraints. This controls how Kubernetes spreads your Pods across nodes within a topology domain, such as a region. For example, you could tell Kubernetes to place a specific number of game server session Pods in each of three Google Cloud zones in the us-central1 region.

For examples, more details, and instructions, refer to Pod Topology Spread Constraints.
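The game-server-session example can be sketched as follows; the names, replica count, and image are illustrative assumptions:

```yaml
# Hypothetical Deployment with a spread constraint: keep game session
# Pods evenly distributed across zones, allowing a per-zone difference
# of at most one Pod.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: game-session
spec:
  replicas: 9
  selector:
    matchLabels:
      app: game-session
  template:
    metadata:
      labels:
        app: game-session
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: game-session
      containers:
      - name: session
        image: us-docker.pkg.dev/my-project/my-repo/game:latest  # placeholder image
```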

Manage and monitor your Autopilot clusters

In Autopilot, GKE automatically manages cluster upgrades and maintenance for both the control plane and worker nodes. Autopilot clusters also have built-in functionality for you to monitor your clusters and workloads.

GKE version upgrades

All Autopilot clusters are enrolled in a GKE release channel. In release channels, GKE manages the Kubernetes version of the cluster, balancing between feature availability and version stability depending on the channel. By default, Autopilot clusters are enrolled in the Regular release channel, but you can select a different channel that meets your stability and functionality needs. For more information about release channels, see About release channels.

GKE automatically starts upgrades, monitors progress, and pauses the operation if problems occur. You can also manually control aspects of the upgrade process, for example by configuring maintenance windows and exclusions.

Monitor your Autopilot clusters

Autopilot clusters already have Cloud Logging, Cloud Monitoring, and Google Cloud Managed Service for Prometheus enabled.

Autopilot clusters collect the following types of logs and metrics automatically, adhering to Google's best practices for telemetry collection:

- Logs for Cloud Logging
- Metrics for Cloud Monitoring

No additional configuration is required to enable logging and monitoring. The following table shows you how to interact with the collected telemetry based on your requirements:

| Use case | Resources |
| --- | --- |
| Understand and access your GKE logs | |
| Observe the performance of your GKE clusters | Effective monitoring of your cluster performance can help you to optimize the operating costs of your clusters and workloads. |
| Monitor the security posture of your clusters | Use the security posture dashboard to audit your running workloads against GKE best practices, scan for vulnerabilities in your container operating systems and language packages, and get actionable mitigation recommendations. To learn more, see About the security posture dashboard. |

Troubleshooting

For troubleshooting steps, refer to Troubleshooting Autopilot clusters.

What's next
