Showing content from https://run-ai-docs.nvidia.com/self-hosted/platform-management/monitor-performance/before-you-start below:

Before You Start | Run:ai Documentation

Before You Start

NVIDIA Run:ai provides metrics and telemetry for both physical cluster entities, such as clusters, nodes, and node pools, and application organization entities, such as departments and projects. Metrics represent data collected over time, while telemetry represents current analytics data. This data is essential for monitoring and analyzing the performance and health of your platform.
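
The distinction between the two kinds of data can be illustrated with a minimal sketch. The class names, field names, and the metric names `gpu_utilization_percent` and `allocated_gpus` below are hypothetical and chosen only for illustration; they are not taken from the NVIDIA Run:ai API or CLI.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class MetricSample:
    timestamp: datetime
    value: float


@dataclass(frozen=True)
class MetricSeries:
    """Over-time data: many timestamped samples for one entity (hypothetical shape)."""
    entity: str   # e.g. a node pool or project name
    name: str     # illustrative metric name, not an official one
    samples: List[MetricSample]

    def average(self) -> float:
        # Aggregating over a time window only makes sense for over-time data.
        if not self.samples:
            raise ValueError("series has no samples")
        return sum(s.value for s in self.samples) / len(self.samples)


@dataclass(frozen=True)
class TelemetryReading:
    """Current analytics data: one value describing the present state (hypothetical shape)."""
    entity: str
    name: str
    value: float


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
gpu_utilization = MetricSeries(
    entity="project-a",
    name="gpu_utilization_percent",
    samples=[
        MetricSample(start + timedelta(minutes=5 * i), v)
        for i, v in enumerate([40.0, 60.0, 80.0])
    ],
)
allocated_gpus = TelemetryReading(entity="project-a", name="allocated_gpus", value=4.0)

print(f"{gpu_utilization.name}: {len(gpu_utilization.samples)} samples, "
      f"mean {gpu_utilization.average():.1f}")
print(f"{allocated_gpus.name}: {allocated_gpus.value} (current)")
```

In this sketch a metric is something you aggregate or chart over a time range, while a telemetry reading is something you read once to learn the current state.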

Consuming Metrics and Telemetry Data

Users can consume the data based on their permissions:

  1. UI - Visualize the data through the NVIDIA Run:ai user interface.

Refer to metrics and telemetry to see the full list of supported metrics and telemetry APIs.

Use the CLI list and describe commands to fetch and manage the data, for example to describe the telemetry of a specific workload, or to list projects and view their telemetry and metrics. See CLI reference for more details.

Refer to metrics and telemetry to see the full list of supported metrics and telemetry.

