This article demonstrates how to train a model with AutoML using the AutoML Python API. See AutoML Python API reference for more details.
The API provides functions to start classification, regression, and forecasting AutoML runs. Each function call trains a set of models and generates a trial notebook for each model.
See Requirements for AutoML experiments.
Set up an experiment using the AutoML API
The following steps generally describe how to set up an AutoML experiment using the API:
Create a notebook and attach it to a cluster running Databricks Runtime ML.
Identify which table you want to use from your existing data source or upload a data file to DBFS and create a table.
To start an AutoML run, use the automl.regress(), automl.classify(), or automl.forecast() function and pass the table, along with any other training parameters. To see all functions and parameters, see AutoML Python API reference.
For example:
Python
from databricks import automl
summary = automl.regress(dataset=train_pdf, target_col="col_to_predict")
summary = automl.classify(dataset=train_pdf, target_col="col_to_predict")
summary = automl.forecast(dataset=train_pdf, target_col="col_to_predict", time_col="date_col", horizon=horizon, frequency="d", output_database="default")
When the AutoML run begins, an MLflow experiment URL appears in the console. Use this URL to monitor the run's progress. Refresh the MLflow experiment to see the trials as they are completed.
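The steps above can be sketched as a small helper. This is a minimal sketch under stated assumptions, not the definitive workflow: `run_automl` is a hypothetical wrapper name, and the `automl` module is passed in as an argument so that the snippet can be read and exercised outside Databricks. The `timeout_minutes` parameter and the `best_trial.notebook_path` attribute in the usage comments are taken from the AutoML Python API reference as best understood here, so check them against your runtime version.

```python
SUPPORTED_PROBLEM_TYPES = ("regress", "classify", "forecast")


def run_automl(automl, problem_type, dataset, target_col, **params):
    """Start an AutoML run and return whatever summary object the API returns.

    `automl` is the module obtained with `from databricks import automl`
    on a cluster running Databricks Runtime ML (hypothetical wrapper).
    """
    if problem_type not in SUPPORTED_PROBLEM_TYPES:
        raise ValueError(f"Unknown problem type: {problem_type!r}")
    # Look up automl.regress, automl.classify, or automl.forecast by name.
    runner = getattr(automl, problem_type)
    # Extra keyword arguments (for example time_col and horizon for
    # forecasting) are forwarded unchanged to the AutoML function.
    return runner(dataset=dataset, target_col=target_col, **params)


# Usage in a Databricks notebook (not executed here):
#   from databricks import automl
#   summary = run_automl(automl, "classify", train_pdf, "col_to_predict",
#                        timeout_minutes=30)
#   print(summary.best_trial.notebook_path)
```

Passing the module in explicitly keeps the helper free of any Databricks import, which makes it easy to try out with a stand-in object before running a real experiment.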
After the AutoML run completes:
To import a notebook saved as an MLflow artifact, use the databricks.automl.import_notebook Python API. For more information, see Import notebook.
You can register and deploy your AutoML-trained model just like any registered model in the MLflow model registry; see Log, load, and register MLflow models.
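The two follow-up actions above might look like the sketch below. Both helper names are hypothetical. The sketch assumes that `import_notebook` accepts a trial's `artifact_uri` and a workspace directory and returns an object with a `path` attribute, and that each trial logs its model under the `model` artifact path of its MLflow run. Verify both assumptions against the AutoML Python API reference for your runtime.

```python
def import_trial_notebook(automl, trial, workspace_dir):
    """Import a trial notebook into the workspace and return its path.

    `automl` is the module from `from databricks import automl`;
    `trial` is one entry of the run summary, for example summary.best_trial.
    """
    # Assumption: the trial exposes the artifact location as `artifact_uri`.
    result = automl.import_notebook(trial.artifact_uri, workspace_dir)
    return result.path


def register_best_model(summary, model_name, register_fn=None):
    """Register the best trial's model in the MLflow model registry."""
    if register_fn is None:
        # Imported lazily so the helper can be exercised without MLflow.
        import mlflow
        register_fn = mlflow.register_model
    # Assumption: the trained model is logged under the "model" artifact path.
    model_uri = f"runs:/{summary.best_trial.mlflow_run_id}/model"
    return register_fn(model_uri, model_name)


# Usage in a Databricks notebook (not executed here):
#   path = import_trial_notebook(automl, summary.best_trial, "/Users/<you>/automl")
#   version = register_best_model(summary, "my_automl_model")
```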
No module named pandas.core.indexes.numeric
When serving a model built using AutoML with Model Serving, you may get the error: No module named pandas.core.indexes.numeric.
This is due to an incompatible pandas version between AutoML and the model serving endpoint environment. To resolve the error:
Update the requirements.txt and conda.yaml for your logged model to include the appropriate pandas dependency version: pandas==1.5.3.
To do so, you need the run_id of the MLflow run where your model was logged.
The following notebook shows how to do regression with AutoML.
AutoML regression example notebook
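For the pandas version error described earlier, the requirements.txt change can be sketched as a plain text transformation. This is an illustrative helper, not a Databricks-provided script: `pin_requirement` is a hypothetical name, and downloading the model artifacts for the run and re-logging them afterward is deliberately left out. The pip section of conda.yaml would need the same pin.

```python
import re


def pin_requirement(requirements_text, package="pandas", version="1.5.3"):
    """Return requirements.txt content with `package` pinned to `version`.

    An existing entry for the package is replaced; if none exists, the
    pin is appended at the end.
    """
    pinned = f"{package}=={version}"
    lines = []
    replaced = False
    for line in requirements_text.splitlines():
        # Extract the bare package name in front of any version specifier,
        # so "pandas>=1.0" matches but "pandas-stubs==1.0" does not.
        name = re.split(r"[<>=!~\[; ]", line.strip(), maxsplit=1)[0]
        if name.lower() == package.lower():
            if not replaced:
                lines.append(pinned)
                replaced = True
            # Any further duplicate entries for the package are dropped.
        else:
            lines.append(line)
    if not replaced:
        lines.append(pinned)
    return "\n".join(lines) + "\n"


# Example: an incompatible pandas pin is rewritten, other lines are kept.
print(pin_requirement("mlflow<3\npandas==2.0.3\n"))
```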