Beta
This feature is in Beta.
This article describes serverless GPU compute on Databricks and provides recommended use cases, guidance for how to set up GPU compute resources, and feature limitations.
What is serverless GPU compute?
Serverless GPU compute is part of the serverless compute offering. Serverless GPU compute is specialized for custom single-node and multi-node deep learning workloads. You can use serverless GPU compute to train and fine-tune custom models using your favorite frameworks and get state-of-the-art efficiency, performance, and quality.
Serverless GPU compute includes:
The pre-installed packages on serverless GPU compute are not a replacement for Databricks Runtime ML. While there are common packages, not all Databricks Runtime ML dependencies and libraries are reflected in the serverless GPU compute environment.
Recommended use cases
Databricks recommends serverless GPU compute for any model training use case that requires training customizations and GPUs.
For example:
Serverless GPU compute is available in us-west-2 or us-east-1.
Serverless GPU compute for notebooks uses environment versions, which provide a stable client API to ensure application compatibility. This allows Databricks to upgrade the server independently, delivering performance improvements, security enhancements, and bug fixes without requiring any code changes to workloads.
Serverless GPU compute uses environment version 3 in addition to the following packages:
CUDA 12.4
torch 2.6.0
torchvision 0.21.0
See Serverless environment version 3 for the packages included in system environment version 3.
Also see the Serverless GPU Python API documentation.
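To confirm a notebook is attached to a GPU environment with the versions listed above, you can run a quick sanity check. This is a minimal sketch; the expected values assume environment version 3 as described in this section:

```python
import torch
import torchvision

# Verify the preinstalled framework versions and that a GPU is visible.
print(torch.__version__)        # expected 2.6.0
print(torchvision.__version__)  # expected 0.21.0
print(torch.version.cuda)       # expected 12.4
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```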
Add libraries to the environment
You can install additional libraries to the serverless GPU compute environment. See Add dependencies to the notebook.
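For example, a minimal sketch of installing an extra package from a notebook cell. The package and version below are placeholders, and the environment panel described in Add dependencies to the notebook is the documented path:

```python
# Hypothetical example: install an additional library for the current notebook session.
%pip install sentence-transformers==3.0.1
```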
Set up serverless GPU compute
You can select serverless GPU compute from the notebook environment in your workspace.
After you open your notebook:
Note: Connection to your compute auto-terminates after 60 minutes of inactivity.
Create and schedule a job
The following steps show how to create and schedule jobs for your serverless GPU compute workloads. See Create and manage scheduled notebook jobs for more details.
After you open the notebook you want to use:
You can also create and schedule jobs from the Jobs and pipelines UI. See Create a new job for step-by-step guidance.
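As an alternative to the UI, a hedged sketch of creating a scheduled notebook job with the Databricks SDK for Python is shown below. The job name, notebook path, and cron expression are placeholders, and the serverless GPU settings for the task are assumed to be configured through the notebook's environment:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

# Create a job that runs a notebook on a daily schedule (02:00 UTC).
created = w.jobs.create(
    name="nightly-gpu-training",  # placeholder name
    tasks=[
        jobs.Task(
            task_key="train",
            notebook_task=jobs.NotebookTask(
                notebook_path="/Workspace/Users/someone@example.com/train"  # placeholder path
            ),
        )
    ],
    schedule=jobs.CronSchedule(
        quartz_cron_expression="0 0 2 * * ?",
        timezone_id="UTC",
    ),
)
print(created.job_id)
```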
Limitations
Serverless GPU compute only supports A10 or similar compute.
PrivateLink is not supported. Storage or pip repos behind PrivateLink are not supported.
Serverless GPU compute is not supported for compliance security profile workspaces (such as HIPAA or PCI-DSS). Processing regulated data is not supported at this time.
Serverless GPU compute is only supported on interactive environments.
Scheduled jobs on Serverless GPU compute:
Example notebooks
The notebooks in this section are examples that demonstrate how to use serverless GPU compute for different scenarios.
Deep learning with PyTorch
The following notebook provides a simple example of how to run deep learning training using PyTorch and serverless GPU compute.
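For orientation, a minimal, self-contained sketch of the kind of training loop the notebook covers (synthetic data, not the notebook's code):

```python
import torch
import torch.nn as nn

# Train a small regression model on synthetic data using the attached GPU if available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X = torch.randn(1024, 10, device=device)
y = X.sum(dim=1, keepdim=True)

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```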
Deep learning training using PyTorch notebook
Distributed training and hyperparameter sweeps
The following notebook provides an example of distributed training and hyperparameter sweeps for fine-tuning using the Serverless GPU Python API.
Serverless GPU compute sweeps and distributed training notebook
Fine-tune Qwen2-0.5B model
The following notebook provides an example of how to efficiently fine-tune the Qwen2-0.5B model using:
Fine-tune an embedding model
The following notebook provides an example of how to fine-tune an embedding model. This example uses contrastive learning to fine-tune the gte-large-en-v1.5 embedding model on a single A10G.
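For context, a hedged sketch of contrastive fine-tuning with the sentence-transformers library is shown below; the model ID, training pairs, and hyperparameters are illustrative assumptions, not the notebook's code:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Load the base embedding model (assumed Hugging Face ID for gte-large-en-v1.5).
model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

# Toy positive pairs; MultipleNegativesRankingLoss treats other in-batch pairs as negatives.
train_examples = [
    InputExample(texts=["how do I reset my password", "steps to reset a forgotten password"]),
    InputExample(texts=["what is serverless GPU compute", "overview of the serverless GPU offering"]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=0)
```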
Fine-tune Llama model with Unsloth
This notebook demonstrates how to fine-tune Llama-3.2-3B using the Unsloth library.
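A hedged sketch of the Unsloth setup step follows; the model ID and LoRA settings are assumptions for illustration, not the notebook's exact configuration:

```python
from unsloth import FastLanguageModel

# Load Llama-3.2-3B in 4-bit and attach LoRA adapters before fine-tuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B",  # assumed model ID
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```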
Fine-tune Llama model with Unsloth notebook
Object detection custom fine-tuning
This notebook demonstrates how to train an object detection model using a Hugging Face example on one A10 GPU.
Object detection custom fine-tuning notebook
XGBoost model training
This notebook demonstrates how to train an XGBoost regression model on a single GPU.
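For orientation, a minimal sketch of GPU-accelerated XGBoost regression on synthetic data (not the notebook's code); the device="cuda" parameter assumes XGBoost 2.x:

```python
import numpy as np
from xgboost import XGBRegressor

# Synthetic regression data.
rng = np.random.default_rng(0)
X = rng.random((10_000, 20))
y = X @ rng.random(20) + rng.normal(scale=0.1, size=10_000)

# tree_method="hist" with device="cuda" trains on the GPU in XGBoost 2.x.
model = XGBRegressor(n_estimators=200, tree_method="hist", device="cuda")
model.fit(X, y)
print(model.predict(X[:5]))
```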
XGBoost model training notebook
Two-tower recommendation model
These notebooks demonstrate how to convert your recommendation data into MDS format and then use that data to create a two-tower recommendation model.
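To illustrate the first step, a hedged sketch of writing interaction data to MDS shards with MosaicML Streaming; the column names and output path are assumptions, not the notebooks' schema:

```python
from streaming import MDSWriter

# Each sample is one user-item interaction; columns map field names to MDS encodings.
columns = {"user_id": "int", "item_id": "int", "label": "int"}

samples = [
    {"user_id": 1, "item_id": 42, "label": 1},
    {"user_id": 2, "item_id": 7, "label": 0},
]

# Write the samples as MDS shards to a local (placeholder) output directory.
with MDSWriter(out="/tmp/recsys_mds", columns=columns) as writer:
    for sample in samples:
        writer.write(sample)
```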
Distributed TRL SFT training
This notebook demonstrates how to use Databricks serverless GPU compute to run supervised fine-tuning (SFT) using the TRL library with DeepSpeed ZeRO Stage 3 optimization on a single-node A10 GPU.
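For context, a hedged sketch of a basic SFT run with TRL; the model ID, dataset, and arguments are placeholders, and the DeepSpeed ZeRO Stage 3 configuration used in the notebook would be supplied separately through the training arguments:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Small public chat dataset used in TRL examples; placeholder for your own data.
dataset = load_dataset("trl-lib/Capybara", split="train[:100]")

trainer = SFTTrainer(
    model="Qwen/Qwen2-0.5B",  # placeholder model ID
    train_dataset=dataset,
    args=SFTConfig(output_dir="/tmp/sft-output", max_steps=10),
)
trainer.train()
```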
Distributed TRL SFT Training notebook
Time series forecasting with GluonTS
This notebook demonstrates an end-to-end workflow for probabilistic time-series forecasting of electricity-consumption data with GluonTS's DeepAR model on a serverless GPU cluster, covering data ingestion, resampling, model training, prediction, visualization, and evaluation.
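For orientation, a minimal sketch of fitting DeepAR with GluonTS on a synthetic hourly series (not the notebook's electricity dataset):

```python
import pandas as pd
from gluonts.dataset.pandas import PandasDataset
from gluonts.torch import DeepAREstimator

# Synthetic hourly series standing in for electricity-consumption data.
index = pd.date_range("2024-01-01", periods=500, freq="H")
df = pd.DataFrame({"target": range(500)}, index=index, dtype="float32")
dataset = PandasDataset(df, target="target")

# Train a small DeepAR model and produce a 24-hour probabilistic forecast.
estimator = DeepAREstimator(freq="H", prediction_length=24, trainer_kwargs={"max_epochs": 1})
predictor = estimator.train(dataset)

forecast = next(iter(predictor.predict(dataset)))
print(forecast.mean[:5])
```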
Time series forecasting with GluonTS notebook