Introduction to Workloads | Run:ai Documentation

Introduction to Workloads

NVIDIA Run:ai enhances visibility and simplifies management by monitoring, presenting, and orchestrating all AI workloads in the clusters where it is installed. Workloads are the fundamental building blocks for consuming resources, enabling AI practitioners such as researchers, data scientists, and engineers to efficiently support the entire life cycle of an AI initiative.

Workloads Across the AI Life Cycle

A typical AI initiative progresses through several key stages, each with distinct workloads and objectives. With NVIDIA Run:ai, research and engineering teams can host and manage the workloads for all of these stages.

A workload runs in the cluster, is associated with a namespace, and operates to fulfill its purpose: running to completion for a batch job, allocating resources for experimentation in an integrated development environment (IDE) or notebook, or serving inference requests in production.
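Concretely, a workload's pods land in a namespace tied to its project. As an illustrative sketch only (the namespace, job, and image names below are hypothetical, and Run:ai workloads are normally submitted through its own UI, CLI, or API rather than raw manifests), a batch-style workload in Kubernetes terms might look like:

```yaml
# Illustrative only: a plain Kubernetes batch Job placed in a project
# namespace. Namespace, name, and image are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: train-resnet
  namespace: runai-team-a          # assumed project-to-namespace mapping
spec:
  template:
    spec:
      containers:
        - name: trainer
          image: example.com/train:latest   # hypothetical training image
          resources:
            limits:
              nvidia.com/gpu: 1             # GPU granted at scheduling time
      restartPolicy: Never                  # batch job: runs to completion
```

The Job runs to completion, matching the batch case described above; an IDE/notebook or inference workload would instead keep its pods (and their resources) alive to serve interactive or production traffic.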

The workload, defined by the AI practitioner, consists of:

Workload Scheduling and Orchestration

NVIDIA Run:ai’s core mission is to optimize AI resource usage at scale. This is achieved through efficient scheduling and orchestration of all cluster workloads using the NVIDIA Run:ai Scheduler. The Scheduler allows workloads to be prioritized across the organization's departments and projects at large scale, based on the resource distribution set by the system administrator.
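As a toy illustration of this idea (not the actual NVIDIA Run:ai Scheduler, whose algorithm is more sophisticated), the sketch below grants each project its administrator-set quota first, then shares any spare GPUs among projects that still request more:

```python
# Toy quota-based fair-share sketch. Illustrative only; it does not
# reproduce the NVIDIA Run:ai Scheduler's actual behavior.
from dataclasses import dataclass

@dataclass
class Project:
    name: str
    quota: int      # guaranteed GPUs set by the administrator
    requested: int  # GPUs currently requested by the project's workloads

def allocate(projects, total_gpus):
    """Grant each project up to its quota first, then hand out spare
    GPUs one at a time to projects that still want more."""
    grants = {p.name: min(p.quota, p.requested) for p in projects}
    spare = total_gpus - sum(grants.values())
    hungry = [p for p in projects if p.requested > grants[p.name]]
    while spare > 0 and hungry:
        for p in list(hungry):
            if spare == 0:
                break
            grants[p.name] += 1
            spare -= 1
            if grants[p.name] == p.requested:
                hungry.remove(p)
    return grants

projects = [Project("team-a", quota=4, requested=6),
            Project("team-b", quota=2, requested=1),
            Project("team-c", quota=2, requested=4)]
print(allocate(projects, total_gpus=8))
# → {'team-a': 5, 'team-b': 1, 'team-c': 2}
```

Note that team-b's unused quota is reassigned to over-quota projects instead of sitting idle, which is the intuition behind scheduling against an administrator-defined resource distribution.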

NVIDIA Run:ai and Third-Party Workloads

Different types of workloads have different levels of support, so it is important to understand which capabilities are needed before selecting a workload type. The table below details the level of support for each workload type in NVIDIA Run:ai. NVIDIA Run:ai workloads are fully supported with all of NVIDIA Run:ai's advanced features and capabilities, while third-party workloads are only partially supported. The list of capabilities can change between NVIDIA Run:ai versions.

Workload types compared include NVIDIA Run:ai Training - Standard and NVIDIA Run:ai Training - Distributed.

Workload awareness

Workload-aware visibility, so that different pods are identified and treated as a single workload (for example, in GPU utilization metrics, the workload view, and dashboards).
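To illustrate what treating pods as a single workload means for metrics, here is a minimal sketch (the field names are hypothetical, not Run:ai's data model) that rolls per-pod GPU utilization up into per-workload averages:

```python
# Toy sketch: aggregate per-pod GPU utilization into per-workload
# figures, the way workload-aware visibility groups pods.
# Field names ("workload", "gpu_util") are hypothetical.
from collections import defaultdict

pods = [
    {"workload": "train-resnet", "gpu_util": 0.92},
    {"workload": "train-resnet", "gpu_util": 0.88},  # second worker, same job
    {"workload": "notebook-eda", "gpu_util": 0.15},
]

def by_workload(pods):
    """Average GPU utilization per workload rather than per pod."""
    sums, counts = defaultdict(float), defaultdict(int)
    for p in pods:
        sums[p["workload"]] += p["gpu_util"]
        counts[p["workload"]] += 1
    return {w: sums[w] / counts[w] for w in sums}

print(by_workload(pods))  # one utilization figure per workload
```

A dashboard built this way shows the two training workers as a single `train-resnet` entry, which is the behavior the capability above describes.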

