This article describes support for deploying a custom model using Mosaic AI Model Serving. It also provides details about supported model logging options and compute types, how to package model dependencies for serving, and endpoint creation and scaling.
What are custom models?

Model Serving can deploy any Python model as a production-grade API. Databricks refers to such models as custom models. These ML models can be trained using standard ML libraries like scikit-learn, XGBoost, PyTorch, and HuggingFace transformers, and can include any Python code.
To deploy a custom model, log the model in the MLflow format, register it in Unity Catalog or the workspace model registry, and then create a model serving endpoint to deploy and query it.
For a complete tutorial on how to serve custom models on Databricks, see Model serving tutorial.
Databricks also supports serving foundation models for generative AI applications. See Foundation Model APIs and External models for supported models and compute offerings.
important
If you rely on Anaconda, review the terms of service notice for additional information.
Log ML models

There are different methods to log your ML model for model serving. The following list summarizes the supported methods and examples.
Autologging. This method is automatically enabled when using Databricks Runtime for ML.
Python
import mlflow
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import load_iris

iris = load_iris()
model = RandomForestRegressor()

# With autologging enabled, fit() automatically logs the model,
# parameters, and training metrics to MLflow
model.fit(iris.data, iris.target)
Log using MLflow's built-in flavors. You can use this method if you want to manually log the model for more detailed control.
Python
import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
iris = load_iris()
model = RandomForestClassifier()
model.fit(iris.data, iris.target)
with mlflow.start_run():
    mlflow.sklearn.log_model(model, "random_forest_classifier")
Custom logging with pyfunc. You can use this method for deploying arbitrary Python code models or deploying additional code alongside your model.
Python
import mlflow
import mlflow.pyfunc
class Model(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        return model_input * 2

with mlflow.start_run():
    mlflow.pyfunc.log_model("custom_model", python_model=Model())
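To confirm the logged model behaves as expected, you can load it back through the pyfunc interface and call predict. The following is a minimal sketch, with a hypothetical run ID placeholder:

Python

import mlflow
import pandas as pd

# Load the model logged above by its runs:/ URI (replace <run_id>)
loaded = mlflow.pyfunc.load_model("runs:/<run_id>/custom_model")

# Calls the PythonModel's predict method defined above, doubling each value
print(loaded.predict(pd.DataFrame({"x": [1, 2, 3]})))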
Adding a signature and input example when logging your model is recommended. Signatures are required for logging models to Unity Catalog.
The following is a signature example:
Python
from mlflow.models.signature import infer_signature
signature = infer_signature(training_data, model.predict(training_data))
mlflow.sklearn.log_model(model, "model", signature=signature)
The following is an input example:
Python
input_example = {"feature1": 0.5, "feature2": 3}
mlflow.sklearn.log_model(model, "model", input_example=input_example)
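You can pass both in the same call. The following is a minimal sketch, assuming the model and training_data from the examples above:

Python

import mlflow
from mlflow.models.signature import infer_signature

with mlflow.start_run():
    # Infer the signature from training data and the model's predictions
    signature = infer_signature(training_data, model.predict(training_data))
    mlflow.sklearn.log_model(
        model,
        "model",
        signature=signature,
        input_example={"feature1": 0.5, "feature2": 3},
    )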
Compute type
Mosaic AI Model Serving provides a variety of CPU and GPU options for deploying your model. When deploying with a GPU, you must make sure that your code is set up so that predictions are run on the GPU, using the methods provided by your framework. MLflow does this automatically for models logged with the PyTorch or Transformers flavors.
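For custom pyfunc models, this typically means moving the model and its inputs to the GPU yourself. The following is a minimal sketch, assuming a PyTorch model; the "weights" artifact key is hypothetical, and model_input is assumed to be a pandas DataFrame of numeric features:

Python

import mlflow
import torch

class GpuModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Run on GPU when available, otherwise fall back to CPU
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        # "weights" is a hypothetical artifact key pointing at saved model weights
        self.model = torch.load(context.artifacts["weights"], map_location=self.device)
        self.model.eval()

    def predict(self, context, model_input):
        # Move the inputs to the same device as the model before inference
        batch = torch.tensor(model_input.values, dtype=torch.float32).to(self.device)
        with torch.no_grad():
            return self.model(batch).cpu().numpy()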
Deployment container and dependencies

During deployment, a production-grade container is built and deployed as the endpoint. This container includes libraries automatically captured or specified in the MLflow model.
The model serving container doesn't contain pre-installed dependencies, which might lead to dependency errors if not all required dependencies are included in the model. If you run into model deployment issues, Databricks recommends that you test the model locally.
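One way to test locally is to load the logged model through the pyfunc interface and run a prediction before creating the endpoint. A minimal sketch, with a hypothetical model URI and sample input:

Python

import mlflow
import pandas as pd

# Hypothetical URI of the logged model
model_uri = "runs:/<run_id>/model"

# Loading and predicting locally surfaces missing or mis-pinned
# dependencies before you create a serving endpoint
loaded = mlflow.pyfunc.load_model(model_uri)
print(loaded.predict(pd.DataFrame([{"feature1": 0.5, "feature2": 3}])))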
Package and code dependencies

Custom or private libraries can be added to your deployment. See Use custom Python libraries with Model Serving.
For MLflow native flavor models, the necessary package dependencies are automatically captured.
For custom pyfunc models, dependencies can be explicitly added.

You can add package dependencies using:

The pip_requirements parameter:
Python
mlflow.sklearn.log_model(model, "sklearn-model", pip_requirements=["scikit-learn", "numpy"])
The conda_env parameter:
Python
conda_env = {
    'channels': ['defaults'],
    'dependencies': [
        'python=3.7.0',
        'scikit-learn=0.21.3'
    ],
    'name': 'mlflow-env'
}

mlflow.sklearn.log_model(model, "sklearn-model", conda_env=conda_env)
To include additional requirements beyond what is automatically captured, use extra_pip_requirements.
Python
mlflow.sklearn.log_model(model, "sklearn-model", extra_pip_requirements=["sklearn_req"])
If you have code dependencies, these can be specified using code_path.
Python
mlflow.sklearn.log_model(model, "sklearn-model", code_path=["path/to/helper_functions.py"])
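Files passed through code_path are packaged with the model artifact and placed on the Python path when the model is loaded, so your model code can import them. The following is a minimal sketch, assuming a hypothetical helper_functions.py that defines preprocess():

Python

import mlflow

class ModelWithHelpers(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        # helper_functions.py was shipped via code_path, so it is importable here
        from helper_functions import preprocess  # hypothetical helper module
        return preprocess(model_input)

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        "custom_model",
        python_model=ModelWithHelpers(),
        code_path=["path/to/helper_functions.py"],
    )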
Dependency validation

Before deploying a custom MLflow model, it is beneficial to verify that the model can be served. MLflow provides an API that validates the model artifact by simulating the deployment environment, and it allows testing of modified dependencies.
There are two pre-deployment validation APIs: the MLflow Python API and the MLflow CLI.
You can specify the following using either of these APIs:

- The model_uri of the model that is deployed to model serving.
- The input_data in the expected format for the mlflow.pyfunc.PyFuncModel.predict() call of the model, or an input_path that defines a file containing input data that is loaded and used for the call to predict.
- The content_type, in csv or json format.
- An optional output_path to write the predictions to a file. If you omit this parameter, the predictions are printed to stdout.
- The env_manager that is used to build the environment for serving:
  - virtualenv. Recommended for serving validation.
  - local is available, but potentially error prone for serving validation. Generally used only for rapid debugging.
- Whether to install MLflow in the environment, using install_mlflow. This setting defaults to False.
- Package dependency versions to override and test, using pip_requirements_override.

For example:
Python
import mlflow

run_id = "..."
model_uri = f"runs:/{run_id}/model"

mlflow.models.predict(
    model_uri=model_uri,
    input_data={"col1": 34.2, "col2": 11.2, "col3": "green"},
    content_type="json",
    env_manager="virtualenv",
    install_mlflow=False,
    pip_requirements_override=["pillow==10.3.0", "scipy==1.13.0"],
)
Dependency updates
If there are any issues with the dependencies specified with a logged model, you can update the requirements by using the MLflow CLI or mlflow.models.model.update_model_requirements() in the MLflow Python API, without having to log another model.

The following example shows how to update the pip_requirements.txt of a logged model in place. You can update existing definitions with specified package versions or add requirements that don't already exist to the pip_requirements.txt file. This file is within the MLflow model artifact at the specified model_uri location.
Python
from mlflow.models.model import update_model_requirements

# Add (or pin) the listed packages in the logged model's requirements
update_model_requirements(
    model_uri=model_uri,
    operation="add",
    requirement_list=["pillow==10.2.0", "scipy==1.12.0"],
)
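After updating the requirements, you can re-run the pre-deployment validation described above to confirm that the model loads and predicts in the updated environment. A minimal sketch, reusing model_uri and the input from the earlier example:

Python

import mlflow

# Re-validate the model in a fresh virtualenv built from the updated requirements
mlflow.models.predict(
    model_uri=model_uri,
    input_data={"col1": 34.2, "col2": 11.2, "col3": "green"},
    content_type="json",
    env_manager="virtualenv",
)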
Expectations and limitations
The following sections describe known expectations and limitations for serving custom models using Model Serving.
Endpoint creation and update expectations

note
The information in this section does not apply to endpoints that serve foundation models or external models.
Deploying a newly registered model version involves packaging the model and its model environment and provisioning the model endpoint itself. This process can take approximately 10 minutes.
Databricks performs a zero-downtime update of endpoints by keeping the existing endpoint configuration in place until the new one becomes ready. Doing so reduces the risk of interruption for endpoints that are in use.
If model computation takes longer than 120 seconds, requests will time out. If you believe your model computation will take longer than 120 seconds, reach out to your Databricks account team.
Databricks performs occasional zero-downtime system updates and maintenance on existing Model Serving endpoints. During maintenance, Databricks reloads models and marks an endpoint as Failed if a model fails to reload. Make sure your customized models are robust and are able to reload at any time.
Endpoint scaling expectations

note
The information in this section does not apply to endpoints that serve foundation models or external models.
Serving endpoints automatically scale based on traffic and the capacity of provisioned concurrency units.
The following are limitations for serving endpoints with GPU workloads:

- GPU workloads are not available in the ap-southeast-1 region.

The following notice is for customers relying on Anaconda.
important
Anaconda Inc. updated their terms of service for anaconda.org channels. Based on the new terms of service, you may require a commercial license if you rely on Anaconda's packaging and distribution. See the Anaconda Commercial Edition FAQ for more information. Your use of any Anaconda channels is governed by their terms of service.
MLflow models logged before v1.18 (Databricks Runtime 8.3 ML or earlier) were by default logged with the conda defaults channel (https://repo.anaconda.com/pkgs/) as a dependency. Because of this license change, Databricks has stopped the use of the defaults channel for models logged using MLflow v1.18 and above. The default channel logged is now conda-forge, which points at the community-managed https://conda-forge.org/.
If you logged a model before MLflow v1.18 without excluding the defaults channel from the conda environment for the model, that model may have a dependency on the defaults channel that you may not have intended. To manually confirm whether a model has this dependency, you can examine the channel value in the conda.yaml file that is packaged with the logged model. For example, a model's conda.yaml with a defaults channel dependency may look like this:
YAML
channels:
- defaults
dependencies:
- python=3.8.8
- pip
- pip:
  - mlflow
  - scikit-learn==0.23.2
  - cloudpickle==1.6.0
name: mlflow-env
Because Databricks cannot determine whether your use of the Anaconda repository to interact with your models is permitted under your relationship with Anaconda, Databricks is not forcing its customers to make any changes. If your use of the Anaconda.com repo through the use of Databricks is permitted under Anaconda's terms, you do not need to take any action.
If you would like to change the channel used in a model's environment, you can re-register the model to the model registry with a new conda.yaml. You can do this by specifying the channel in the conda_env parameter of log_model(), as in the sketch below.
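The following is a minimal sketch of that approach, assuming a scikit-learn model; the channel is switched to conda-forge and the pinned versions are illustrative:

Python

import mlflow

# Conda environment that uses the community-managed conda-forge channel
# instead of defaults; the pinned versions here are illustrative
conda_env = {
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.8.8",
        "pip",
        {"pip": ["mlflow", "scikit-learn==0.23.2", "cloudpickle==1.6.0"]},
    ],
    "name": "mlflow-env",
}

with mlflow.start_run():
    mlflow.sklearn.log_model(model, "model", conda_env=conda_env)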
For more information on the log_model() API, see the MLflow documentation for the model flavor you are working with, for example, log_model for scikit-learn.

For more information on conda.yaml files, see the MLflow documentation.