What is AutoML? | Databricks Documentation

What is AutoML?

AutoML simplifies the process of applying machine learning to your datasets by automatically finding the best algorithm and hyperparameter configuration for you.

How does AutoML work?

Provide your dataset and specify the type of machine learning problem, then AutoML does the following:

  1. Cleans and prepares your data.
  2. Orchestrates distributed model training and hyperparameter tuning across multiple algorithms.
  3. Finds the best model using open source evaluation algorithms from scikit-learn, xgboost, LightGBM, Prophet, and ARIMA.
  4. Presents the results. AutoML also generates source code notebooks for each trial, allowing you to review, reproduce, and modify the code as needed.
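The steps above can be sketched as a simple selection loop: train several candidate models, score each trial, and keep the best. This is a toy illustration of the idea, not Databricks internals; the model names and scores are stand-ins.

```python
# Minimal sketch of the selection loop AutoML automates: evaluate several
# candidate algorithms and keep the one with the best validation score.
# The scores below are hypothetical stand-ins, not real results.
def evaluate(name: str) -> float:
    """Return a hypothetical validation score for a trained candidate."""
    toy_scores = {"decision_tree": 0.81, "xgboost": 0.88, "lightgbm": 0.86}
    return toy_scores[name]

candidates = ["decision_tree", "xgboost", "lightgbm"]
trials = {name: evaluate(name) for name in candidates}
best_model = max(trials, key=trials.get)
print(best_model)  # → xgboost (the trial with the highest score)
```

In the real product, each of these trials also produces a generated notebook you can inspect and rerun.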

Get started with AutoML experiments through a low-code UI for regression, classification, or forecasting, or through the Python API.
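For the Python API route, a classification experiment launch looks roughly like the sketch below. The `databricks.automl` module is only available on Databricks ML runtimes, so this falls back gracefully elsewhere; the table name and parameter values are illustrative assumptions, not values from this page.

```python
# Hedged sketch: starting an AutoML classification experiment from Python.
# Column names and parameter values here are assumptions for illustration.
params = {
    "target_col": "label",       # column to predict (hypothetical name)
    "timeout_minutes": 30,       # stop the experiment after this budget
}

try:
    from databricks import automl  # present only on Databricks ML runtimes

    # `spark` is the session Databricks provides in notebooks.
    summary = automl.classify(dataset=spark.table("my_training_data"), **params)
    print(summary.best_trial.notebook_url)  # link to the best trial notebook
except ImportError:
    print("databricks.automl is only available inside a Databricks workspace")
```

Regression and forecasting experiments follow the same shape with `automl.regress` and `automl.forecast`.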

Requirements

AutoML algorithms

AutoML trains and evaluates models based on the algorithms in the following table.

| Classification | Regression | Forecasting |
| --- | --- | --- |
| Decision trees | Decision trees | Prophet |
| Random forests | Random forests | Auto-ARIMA |
| Logistic regression | Linear regression with stochastic gradient descent | |
| XGBoost | XGBoost | |
| LightGBM | LightGBM | |

note

For classification and regression models, the decision tree, random forests, logistic regression, and linear regression with stochastic gradient descent algorithms are based on scikit-learn.

Trial notebook generation

AutoML on classic compute generates notebooks of the source code behind trials so you can review, reproduce, and modify the code as needed.

For forecasting experiments, AutoML-generated notebooks are automatically imported to your workspace for all trials of your experiment.

For classification and regression experiments, AutoML-generated notebooks for data exploration and the best trial in your experiment are automatically imported to your workspace. Generated notebooks for other experiment trials are saved as MLflow artifacts on DBFS instead of auto-imported into your workspace. For all trials besides the best trial, the notebook_path and notebook_url in the TrialInfo Python API are not set. If you need to use these notebooks, you can manually import them into your workspace with the AutoML experiment UI or the databricks.automl.import_notebook Python API.
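The `TrialInfo` behavior described above can be illustrated with a small mock (this is not the real class): `notebook_path` and `notebook_url` are populated only for the best trial, and remain unset for all other trials until you import their notebooks manually.

```python
# Illustration only, NOT the real databricks.automl TrialInfo class:
# notebook_path / notebook_url are set only for the best trial.
from dataclasses import dataclass
from typing import Optional

@dataclass
class TrialInfoSketch:
    val_metric: float
    notebook_path: Optional[str] = None  # unset for non-best trials
    notebook_url: Optional[str] = None   # unset for non-best trials

# Hypothetical values for illustration.
best = TrialInfoSketch(0.91, "/Users/me/automl/best_trial", "https://example.com/best")
other = TrialInfoSketch(0.87)  # notebook lives in DBFS artifacts instead

print(other.notebook_path)  # → None
```

To use a non-best trial's notebook, you would import it via the experiment UI or the `databricks.automl.import_notebook` API as described above.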

For the data exploration notebook and the best trial notebook, the Source column in the AutoML experiment UI links directly to the generated notebook.

Other generated notebooks are not automatically imported into the workspace. To find them, click into each MLflow run; the IPython notebook is saved in the Artifacts section of the run page. If your workspace administrators have enabled downloading artifacts, you can download the notebook and import it into your workspace.

Shapley values (SHAP) for model explainability

note

For Databricks Runtime 11.1 ML and below, SHAP plots are not generated if the dataset contains a datetime column.

The notebooks produced by AutoML regression and classification runs include code to calculate Shapley values. Shapley values are based on game theory and estimate the importance of each feature to a model's predictions.

AutoML notebooks calculate Shapley values using the SHAP package. Because these calculations are highly memory-intensive, the calculations are not performed by default.

To calculate and display Shapley values:

  1. Go to the Feature importance section in an AutoML-generated trial notebook.
  2. Set shap_enabled = True.
  3. Re-run the notebook.
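The `shap_enabled` switch in the generated notebooks works roughly like this sketch. Everything other than the flag itself is a placeholder: `model`, `background`, and `sample` stand in for the fitted model and data the generated notebook defines earlier.

```python
# Sketch of the flag gating SHAP computation in AutoML trial notebooks.
# `model`, `background`, and `sample` are hypothetical placeholders.
shap_enabled = False  # set to True to compute SHAP values (memory-intensive)

if shap_enabled:
    import shap  # bundled with Databricks ML runtimes

    # Explain predictions on a small sample to limit memory use.
    explainer = shap.KernelExplainer(model.predict, background)
    shap_values = explainer.shap_values(sample)
    shap.summary_plot(shap_values, sample)
```

Because the computation is skipped by default, flipping the flag and re-running the notebook is all that is needed to produce the plots.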
