Data volumes (DVs) are a type of workload asset. They provide a way to store, manage, and share AI training data, which promotes collaboration, simplifies data access control, and streamlines the AI development lifecycle.
Acting as a central repository for organizational data resources, data volumes can represent datasets or raw data stored in Kubernetes Persistent Volume Claims (PVCs).
Once a data volume is created, it can be shared with multiple additional scopes and easily used by AI practitioners when submitting workloads. Shared data volumes are mounted with read-only permissions, ensuring data integrity. Any modifications to the data in a shared DV must be made by writing to the original volume of the PVC used to create the data volume.
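To make the read-only behavior concrete, here is a minimal sketch of what a read-only PVC mount looks like at the plain Kubernetes level. It only builds a pod spec fragment as a Python dictionary; the claim name, volume name, container name, image, and mount path are all hypothetical, and how the platform wires shared data volumes internally is not described here, so treat this as a generic Kubernetes illustration rather than the platform's own mechanism.

```python
import json


def read_only_pvc_mount(claim_name: str, mount_path: str) -> dict:
    """Build a pod spec fragment that mounts an existing PVC read-only."""
    volume_name = "shared-data"  # hypothetical volume name
    return {
        "volumes": [
            {
                "name": volume_name,
                # readOnly on the claim source forces read-only volume mounts.
                "persistentVolumeClaim": {"claimName": claim_name, "readOnly": True},
            }
        ],
        "containers": [
            {
                "name": "trainer",            # hypothetical container name
                "image": "python:3.12-slim",  # placeholder image
                "volumeMounts": [
                    # readOnly on the mount: writes from this container fail.
                    {"name": volume_name, "mountPath": mount_path, "readOnly": True}
                ],
            }
        ],
    }


fragment = read_only_pvc_mount("training-data-pvc", "/data")
print(json.dumps(fragment, indent=2))
```

A workload that needs to change the data would instead mount the origin PVC with write access, which matches the rule that modifications go through the original volume.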
Note
The data volumes feature is disabled by default. If it is unavailable, your administrator must enable it under General Settings → Workloads → Data volumes.
Creating data volumes is supported only for flexible workload submission.
Sharing with multiple scopes - Data volumes can be shared across different scopes in a cluster, including projects and departments. Using data volumes allows for data reuse and collaboration within the organization.
Storage saving - A single copy of the data can be used across multiple scopes.
Sharing large datasets - In large organizations, the data is often stored in a remote location, which can be a barrier for large model training. Even if the data is transferred into the cluster, sharing it easily with multiple users is still challenging. Data volumes can help share the data seamlessly, with maximum security and control.
Sharing data with colleagues - When you need to share training results, generated datasets, or other artifacts with team members, data volumes make the data easily available.
To create a data volume, you must have a PVC data source already created. Make sure the PVC includes data before sharing it.
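One simple way to confirm that the PVC includes data is to run a small check from a workload that already has the PVC mounted. The sketch below assumes the PVC is mounted at a known path inside the container (the path is up to you) and reports whether at least one regular file exists beneath it. The demo uses a temporary directory as a stand-in for the mount path.

```python
import tempfile
from pathlib import Path


def has_data(mount_path: str) -> bool:
    """Return True if at least one regular file exists under mount_path."""
    root = Path(mount_path)
    if not root.is_dir():
        return False
    # rglob walks subdirectories too, so nested datasets are detected.
    return any(p.is_file() for p in root.rglob("*"))


# Demo: a temporary directory stands in for the PVC mount path.
with tempfile.TemporaryDirectory() as tmp:
    empty_result = has_data(tmp)  # no files yet
    (Path(tmp) / "train.csv").write_text("x,y\n1,2\n")
    filled_result = has_data(tmp)  # one file present

print(empty_result, filled_result)  # prints: False True
```

This only checks that files exist; it does not validate their contents or completeness.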
The data volumes table can be found under Workload manager in the NVIDIA Run:ai platform.
The data volumes table provides a list of all the data volumes defined in the platform and allows you to manage them.
The data volumes table comprises the following columns:
The name of the data volume
A description of the data volume
The different lifecycle phases and representation of the data volume condition
The scope of the data source within the organizational tree. Click the scope name to view the organizational tree diagram
The project of the origin PVC
The original PVC from which the data volume was created. Both point to the same PV
The cluster that the data volume is associated with
The user who created the data volume
The timestamp for when the data volume was created
The timestamp of when the data volume was last updated
The following table describes the data volumes' condition and whether they were created successfully for the selected scope.
No issues were found while creating the data volume
The data volume is being created
The data volume is being deleted
When the data volume’s scope is an account, the current version of the cluster is not up to date, or the asset is not a cluster-syncing entity, the status can’t be displayed
Customizing the Table View
Filter - Click ADD FILTER, select the column to filter by, and enter the filter values
Search - Click SEARCH and type the value to search by
Sort - Click a column header to sort the table by that column
Column selection - Click COLUMNS and select the columns to display in the table
Refresh - Click REFRESH to update the table with the latest data
To create a new data volume:
Set the project where the data is located
Set a PVC from which to create the data volume
Enter a name for the data volume. The name must be unique.
Optional: Provide a description of the data volume
Set the Scopes that will be able to mount the data volume
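The creation steps above can be summarized as a small data structure with a pre-submission check. This is only an illustrative sketch: `DataVolumeRequest`, its field names, and `validate` are hypothetical and do not come from the platform's API, and the set of existing names is assumed to be available from wherever you track your data volumes. Scopes are carried along but not validated, since no rule for them is stated here.

```python
from dataclasses import dataclass, field


@dataclass
class DataVolumeRequest:
    """Hypothetical container mirroring the fields of the creation form."""
    origin_project: str   # project where the data is located
    origin_pvc: str       # PVC from which the data volume is created
    name: str             # must be unique
    description: str = ""  # optional
    scopes: list = field(default_factory=list)  # scopes allowed to mount it


def validate(request: DataVolumeRequest, existing_names: set) -> list:
    """Return a list of problems; an empty list means the request looks complete."""
    problems = []
    if not request.origin_project:
        problems.append("origin project is required")
    if not request.origin_pvc:
        problems.append("origin PVC is required")
    if not request.name:
        problems.append("name is required")
    elif request.name in existing_names:
        problems.append(f"name '{request.name}' is already in use")
    return problems


existing = {"imagenet-raw"}  # hypothetical names already taken
ok = DataVolumeRequest("team-a", "imagenet-pvc", "imagenet-v2",
                       scopes=["department-research"])
dup = DataVolumeRequest("team-a", "imagenet-pvc", "imagenet-raw")

print(validate(ok, existing))
print(validate(dup, existing))
```

For the real request format, the Data volumes API reference remains the source of truth.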
To edit a data volume:
Select the data volume you want to edit
To copy an existing data volume:
Select the data volume you want to copy
Enter a name for the data volume. The name must be unique.
Set a new Origin PVC for your data volume, since only one Origin PVC can be used per data volume
To delete a data volume:
Select the data volume you want to delete
Confirm you want to delete the data volume
Note
It is not possible to delete a data volume being used by an existing workload.
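The restriction in the note can be expressed as a guard that runs before a deletion is attempted. The sketch below is a local model only: the workload records, their `data_volumes` key, and both function names are hypothetical, chosen to illustrate the rule rather than to reflect how the platform stores or checks this information.

```python
def workloads_using(volume_name: str, workloads: list) -> list:
    """Return the names of workloads that reference the given data volume."""
    return [
        w["name"]
        for w in workloads
        if volume_name in w.get("data_volumes", [])
    ]


def check_deletable(volume_name: str, workloads: list) -> None:
    """Raise if any existing workload still uses the data volume."""
    users = workloads_using(volume_name, workloads)
    if users:
        raise RuntimeError(
            f"data volume '{volume_name}' is in use by: {', '.join(users)}"
        )


# Hypothetical workload records for the demo.
workloads = [
    {"name": "train-resnet", "data_volumes": ["imagenet-v2"]},
    {"name": "notebook-1", "data_volumes": []},
]

check_deletable("old-dataset", workloads)  # no workload uses it, so no error

try:
    check_deletable("imagenet-v2", workloads)
except RuntimeError as err:
    print(err)
```

In practice this means stopping or deleting the dependent workloads first, then deleting the data volume.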
To view the available actions, go to the Data volumes API reference.