Source: https://run-ai-docs.nvidia.com/self-hosted/workloads-in-nvidia-run-ai/using-inference/nvcf

Deploy NVIDIA Cloud Functions (NVCF) in NVIDIA Run:ai


NVIDIA Cloud Functions (NVCF) is a serverless API platform designed to deploy and manage AI workloads on GPUs. Through its integration with NVIDIA Run:ai, NVCF can be deployed directly onto NVIDIA Run:ai-managed GPU clusters. This allows users to take advantage of NVIDIA Run:ai's scheduling, quota management, and monitoring features. See Supported features for more details.

This guide provides the steps required to integrate NVIDIA Cloud Functions with the NVIDIA Run:ai platform.

By default, inference workloads in NVIDIA Run:ai are assigned a priority of very-high, which is non-preemptible. This behavior ensures that inference workloads, which often serve real-time or latency-sensitive traffic, are guaranteed the resources they need and will not be disrupted by other workloads. For more details, see Workload priority control.
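The very-high, non-preemptible priority described above is ultimately backed by Kubernetes scheduling primitives. As a minimal sketch (not part of the official instructions), you can list the priority classes registered on the cluster; the class names that map to Run:ai's "very-high" priority are installation-dependent:

```shell
#!/bin/sh
# Sketch: inspect the priority classes defined on the cluster.
# "very-high" is the default Run:ai priority for inference workloads;
# the Kubernetes PriorityClass names backing it vary by installation.
PRIORITY="very-high"
echo "Default inference workload priority: $PRIORITY"
if command -v kubectl >/dev/null 2>&1; then
  # Lists all registered priority classes and their values.
  kubectl get priorityclasses
fi
```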

Note

Changing the priority is not supported for NVCF workloads.

To set up NVCF itself, follow the official instructions provided in the NVIDIA Cloud Functions documentation.

Setting up a Cluster in NVCF

Cloud Functions administrators can install the NVIDIA Cluster Agent to enable existing GPU clusters as deployment targets for NVCF functions. Once installed, the cluster appears as a deployment option in the API and Cloud Functions menu, allowing authorized functions to deploy on it. See Cluster Setup & Management to register, configure and verify the cluster.
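After the Cluster Agent is installed, a quick sanity check is to confirm its pods are running on the cluster. The namespace below is an assumption for illustration; substitute whatever namespace your Cluster Agent installation actually used (see the NVCF Cluster Setup & Management documentation):

```shell
#!/bin/sh
# Sketch: confirm the NVCF Cluster Agent pods are up after installation.
# AGENT_NS is an assumption -- replace it with the namespace reported
# by your Cluster Agent install.
AGENT_NS="nvcf-backend"
if command -v kubectl >/dev/null 2>&1; then
  # Shows the agent pods and their status (Running, CrashLoopBackOff, ...).
  kubectl get pods -n "$AGENT_NS"
fi
```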

Setting up a Project in NVIDIA Run:ai

Once the cluster is registered to NVCF and appears as ready, create a project in the NVIDIA Run:ai UI with an NVCF namespace:

  1. Follow the instructions detailed in Projects to create a new project.

  2. When setting the Namespace, choose "Enter existing namespace from the cluster" and enter nvcf-backend.

  3. Assign the necessary resource quotas to the project.

Note

In NVCF, function creation defines the code and resources, while deployment registers the function to a GPU cluster, making it available for execution.

Note

Using a custom Helm chart when creating a Cloud Function is not supported.

After the NVCF function is deployed, it is added to the Workloads table, where it can be managed and monitored.
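Alongside the Workloads table in the UI, the pods backing a deployed function can also be inspected directly on the cluster. A minimal sketch, assuming the function's pods land in the nvcf-backend namespace used earlier:

```shell
#!/bin/sh
# Sketch: list the pods backing deployed NVCF functions. The namespace
# is an assumption based on the project setup steps above; the UI
# Workloads table remains the primary management surface.
NS="nvcf-backend"
if command -v kubectl >/dev/null 2>&1; then
  # -o wide adds node placement, useful for checking GPU scheduling.
  kubectl get pods -n "$NS" -o wide
fi
```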

NVIDIA Run:ai Functionality
