Accelerate AI Model Orchestration with NVIDIA Run:ai on AWS

When it comes to developing and deploying advanced AI models, access to scalable, efficient GPU infrastructure is critical. But managing this infrastructure across cloud-native, containerized environments can be complex and costly. That’s where NVIDIA Run:ai can help. NVIDIA Run:ai is now generally available on AWS Marketplace, making it even easier for organizations to streamline their AI infrastructure management.

Built for Kubernetes-native environments, NVIDIA Run:ai acts as a control plane for GPU infrastructure, removing complexity and enabling organizations to scale AI workloads with speed, efficiency, and proper governance.

This post dives into how NVIDIA Run:ai orchestrates AI workloads and GPUs across Amazon Web Services (AWS). It integrates seamlessly with NVIDIA GPU-accelerated Amazon EC2 instances, Amazon Elastic Kubernetes Service (EKS), Amazon SageMaker HyperPod, AWS Identity and Access Management (IAM), Amazon CloudWatch, and other AWS-native services.

The challenge: efficient GPU orchestration at scale

Modern AI workloads, from large-scale training to real-time inference, require dynamic access to powerful GPUs. But in Kubernetes environments, native support for GPUs is limited: GPUs are allocated to pods statically and in whole units, idle capacity is hard to share across teams, and administrators have little visibility into who is consuming what.

The NVIDIA Run:ai solution

NVIDIA Run:ai addresses these challenges with a Kubernetes-based AI orchestration platform designed specifically for AI/ML workloads. It introduces a virtual GPU pool, enabling dynamic, policy-based scheduling of GPU resources.
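
As a minimal illustration of what that virtual pool enables, the sketch below submits a pod that asks the scheduler for half a GPU. It uses the Kubernetes Python client; the namespace, pod name, and image are placeholders, and the `gpu-fraction` annotation and `runai-scheduler` scheduler name are assumptions about Run:ai's conventions, so check the Run:ai documentation for the exact keys in your version.

```python
# A minimal sketch, assuming Run:ai's fractional-GPU convention: the
# "gpu-fraction" annotation and "runai-scheduler" scheduler name are
# assumptions and may differ by Run:ai version.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="inference-job",                 # illustrative name
        namespace="vision",                   # illustrative namespace
        annotations={"gpu-fraction": "0.5"},  # ask the pool for half a GPU
    ),
    spec=client.V1PodSpec(
        scheduler_name="runai-scheduler",     # hand placement to Run:ai
        containers=[
            client.V1Container(
                name="server",
                image="nvcr.io/nvidia/tritonserver:24.05-py3",
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="vision", body=pod)
```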

Key capabilities include fractional GPU sharing, dynamic policy-based scheduling, workload prioritization, and team-level quotas.

Figure 1. NVIDIA Run:ai Cluster and Control Plane with AWS

How NVIDIA Run:ai works on AWS

NVIDIA Run:ai integrates seamlessly with NVIDIA-powered AWS services to optimize performance and simplify operations:

1. Amazon EC2 GPU-accelerated instances within Kubernetes clusters (NVIDIA A10G, A100, H100, etc.)

NVIDIA Run:ai schedules AI workloads on Kubernetes clusters deployed on EC2 instances with NVIDIA GPUs, maximizing GPU utilization through intelligent sharing and bin packing.
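
For comparison, a whole-GPU request needs nothing Run:ai-specific: the standard `nvidia.com/gpu` resource limit is enough for the scheduler to place a pod on a GPU-backed EC2 node. A minimal sketch with the Kubernetes Python client, where the namespace, pod name, image, and training command are illustrative:

```python
# A minimal sketch: a standard Kubernetes pod requesting one full NVIDIA GPU.
# Namespace, pod name, image, and command are illustrative placeholders.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="train-job", namespace="nlp"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="nvcr.io/nvidia/pytorch:24.05-py3",
                command=["python", "train.py"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one whole GPU
                ),
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="nlp", body=pod)
```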

2. Amazon EKS (Elastic Kubernetes Service)

NVIDIA Run:ai integrates natively with Amazon EKS, providing a robust scheduling and orchestration layer that’s purpose-built for AI workloads. It maximizes the utilization of GPU resources in Kubernetes clusters.
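
On the infrastructure side, EKS only needs GPU-capable worker nodes for Run:ai to schedule onto. Below is a hedged sketch of adding a GPU node group to an existing cluster with boto3; the cluster name, subnet ID, and node role ARN are placeholders.

```python
# Sketch: provision a GPU node group for an existing EKS cluster with boto3.
# Cluster name, subnet IDs, and the node role ARN are placeholders.
import boto3

eks = boto3.client("eks", region_name="us-east-1")

eks.create_nodegroup(
    clusterName="runai-cluster",           # assumed existing EKS cluster
    nodegroupName="gpu-nodes",
    instanceTypes=["g5.2xlarge"],          # NVIDIA A10G-backed instances
    amiType="AL2_x86_64_GPU",              # EKS-optimized GPU AMI
    scalingConfig={"minSize": 0, "maxSize": 8, "desiredSize": 2},
    subnets=["subnet-0123456789abcdef0"],
    nodeRole="arn:aws:iam::111122223333:role/eksNodeRole",
)
```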

3. Amazon SageMaker HyperPod

NVIDIA Run:ai integrates with Amazon SageMaker HyperPod to seamlessly extend AI infrastructure across both on-premises and public/private cloud environments.
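
For context, a HyperPod cluster is created through the SageMaker API before an orchestrator like Run:ai manages it. The sketch below uses boto3's `create_cluster`; the cluster name, instance count, lifecycle-script S3 location, and role ARN are all placeholders, and the exact field set may differ by SDK version.

```python
# Sketch: create a SageMaker HyperPod cluster. All names, the S3 lifecycle
# script location, and the execution role ARN are placeholders.
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

sm.create_cluster(
    ClusterName="runai-hyperpod",
    InstanceGroups=[
        {
            "InstanceGroupName": "gpu-workers",
            "InstanceType": "ml.p5.48xlarge",  # H100-backed instances
            "InstanceCount": 2,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/lifecycle/",  # placeholder
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::111122223333:role/HyperPodRole",
        }
    ],
)
```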

Integrating with Amazon CloudWatch

Monitoring GPU workloads at scale requires real-time observability. NVIDIA Run:ai can be integrated with Amazon CloudWatch to surface GPU utilization, workload-level telemetry, and cluster health through custom metrics, dashboards, and alarms.

By combining NVIDIA Run:ai’s rich workload telemetry with CloudWatch’s analytics and alerting, users gain actionable insights into resource consumption and efficiency.
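
One simple integration pattern is to forward Run:ai telemetry to CloudWatch as custom metrics, which CloudWatch dashboards and alarms can then act on. A sketch with boto3, where the `RunAI/GPU` namespace, the dimension names, and the sample value are assumptions:

```python
# Sketch: publish a GPU-utilization reading taken from Run:ai telemetry as
# a CloudWatch custom metric. Namespace and dimension names are assumptions.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_data(
    Namespace="RunAI/GPU",                  # hypothetical metric namespace
    MetricData=[
        {
            "MetricName": "GPUUtilization",
            "Dimensions": [
                {"Name": "Project", "Value": "nlp"},
                {"Name": "NodePool", "Value": "gpu-nodes"},
            ],
            "Value": 87.5,                  # percent, sample value
            "Unit": "Percent",
        }
    ],
)
```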

Integrating with AWS IAM

Security and governance are foundational for AI infrastructure. NVIDIA Run:ai integrates with AWS IAM to apply fine-grained, role-based permissions to the platform and the AWS resources it manages.

IAM integration ensures that only authorized users and services can access or manage NVIDIA Run:ai resources within your AWS environment.
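
As a concrete example, an administrator might scope a principal down to just the EKS cluster that hosts Run:ai. A minimal sketch with boto3, assuming a hypothetical account ID, cluster name, and policy name:

```python
# Sketch: a minimal IAM policy restricting access to the EKS cluster that
# hosts Run:ai. Account ID, cluster name, and policy name are illustrative.
import json
import boto3

iam = boto3.client("iam")

policy_doc = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["eks:DescribeCluster", "eks:AccessKubernetesApi"],
            "Resource": "arn:aws:eks:us-east-1:111122223333:cluster/runai-cluster",
        }
    ],
}

iam.create_policy(
    PolicyName="RunAIClusterAccess",
    PolicyDocument=json.dumps(policy_doc),
)
```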

Example: multi-team GPU orchestration on EKS

Imagine an enterprise AI platform with three teams: natural language processing (NLP), computer vision, and generative AI. Each team needs guaranteed GPU access for training, while also running inference jobs on shared infrastructure.

With NVIDIA Run:ai, each team receives a guaranteed GPU quota for its training jobs, while idle capacity is dynamically shared across teams for inference and reclaimed when the owning team needs it back.

This model allows AI teams to move faster without stepping on each other’s toes—or burning the budget on underutilized GPUs.
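
In plain Kubernetes terms, the guaranteed-quota half of this setup resembles per-namespace ResourceQuotas on the `nvidia.com/gpu` resource; Run:ai's own project quotas add the dynamic borrowing and reclaiming on top. A rough sketch with the Kubernetes Python client, where the team namespaces and GPU counts are illustrative:

```python
# Sketch: approximate per-team GPU guarantees with Kubernetes ResourceQuotas,
# one namespace per team. Run:ai's project quotas go beyond this; namespaces
# and GPU counts here are illustrative.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

for team, gpus in {"nlp": "8", "vision": "8", "genai": "16"}.items():
    quota = client.V1ResourceQuota(
        metadata=client.V1ObjectMeta(name=f"{team}-gpu-quota", namespace=team),
        spec=client.V1ResourceQuotaSpec(
            hard={"requests.nvidia.com/gpu": gpus}  # cap GPU requests per team
        ),
    )
    core.create_namespaced_resource_quota(namespace=team, body=quota)
```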

Figure 2. NVIDIA Run:ai Dashboard

Get started

As enterprises scale their AI efforts, managing GPU infrastructure manually becomes unsustainable. NVIDIA Run:ai, in combination with NVIDIA technologies on AWS, offers a powerful orchestration layer that simplifies GPU management, boosts utilization, and accelerates AI innovation.

With native integration into EKS, EC2, IAM, SageMaker HyperPod, and CloudWatch, NVIDIA Run:ai provides a unified, enterprise-ready foundation for AI/ML workloads in the cloud.

To learn more or deploy NVIDIA Run:ai in your AWS environment, visit the NVIDIA Run:ai listing on AWS Marketplace or explore the NVIDIA Run:ai documentation.

