Showing content from https://scikit-learn.org/dev/modules/../auto_examples/miscellaneous/plot_pipeline_display.html below:

Displaying Pipelines — scikit-learn 1.8.dev0 documentation

Displaying Pipelines#

In a Jupyter notebook, a pipeline is displayed as an HTML diagram by default, which corresponds to set_config(display='diagram'). To deactivate the HTML representation, use set_config(display='text').

To see the details of a step, click on that step in the diagram.

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

Displaying a Pipeline with a Preprocessing Step and Classifier#

This section constructs a Pipeline with a preprocessing step, StandardScaler, and classifier, LogisticRegression, and displays its visual representation.
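
The code that builds this pipeline is not present in this copy of the page. A minimal sketch consistent with the repr printed later in this section, assuming the step names 'preprocessing' and 'classifier' that appear in that output, could look like this:

```python
from sklearn import set_config  # used by the display call that follows
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Step names are taken from the repr shown in this section;
# both estimators keep their default parameters.
steps = [
    ("preprocessing", StandardScaler()),
    ("classifier", LogisticRegression()),
]
pipe = Pipeline(steps)
```

Each step is a (name, estimator) tuple, and the names are what the diagram and the text repr use as labels.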

The diagram is shown by default, because display='diagram' is the default setting.

set_config(display="diagram")
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('preprocessing', StandardScaler()),
                ('classifier', LogisticRegression())])

Parameters shown when each element of the diagram is expanded:

Pipeline: steps=[('preprocessing', ...), ('classifier', ...)], transform_input=None, memory=None, verbose=False
preprocessing (StandardScaler): copy=True, with_mean=True, with_std=True
classifier (LogisticRegression): penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='lbfgs', max_iter=100, multi_class='deprecated', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None

To view the text pipeline, change to display='text'.

Pipeline(steps=[('preprocessing', StandardScaler()),
                ('classifier', LogisticRegression())])

Restore the default display by calling set_config(display='diagram').
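
The calls that switch the display mode are not present in this copy of the page. The sketch below shows one way to switch to the text representation and back, using set_config and get_config from the top-level sklearn package. The variable names are illustrative, and the config_context variant at the end is an alternative not used in the original example.

```python
from sklearn import config_context, get_config, set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline(
    [("preprocessing", StandardScaler()), ("classifier", LogisticRegression())]
)

# Switch to the plain-text representation globally.
set_config(display="text")
mode_after_text = get_config()["display"]
text_repr = repr(pipe)  # the plain-text form of the pipeline

# Put the default back.
set_config(display="diagram")
mode_after_reset = get_config()["display"]

# Alternative: change the setting only inside a block.
with config_context(display="text"):
    mode_inside = get_config()["display"]
mode_outside = get_config()["display"]
```

The context manager avoids having to remember the reset call, which may be convenient when only a single cell should use the text output.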

Displaying a Pipeline Chaining Multiple Preprocessing Steps & Classifier#

This section constructs a Pipeline with multiple preprocessing steps, PolynomialFeatures and StandardScaler, and a classifier step, LogisticRegression, and displays its visual representation.
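
The construction code for this section is also missing from this copy of the page. A sketch that reproduces the repr printed in this section, with step names and parameter values read off that output, might be:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Names, order, and non-default parameters follow the printed repr.
steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]
pipe = Pipeline(steps)
pipe  # in a notebook, the last expression of a cell is displayed
```

Only parameters that differ from their defaults (degree=3, C=2.0) appear in the text repr.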

Pipeline(steps=[('standard_scaler', StandardScaler()),
                ('polynomial', PolynomialFeatures(degree=3)),
                ('classifier', LogisticRegression(C=2.0))])

Parameters shown when each element of the diagram is expanded:

Pipeline: steps=[('standard_scaler', ...), ('polynomial', ...), ...], transform_input=None, memory=None, verbose=False
standard_scaler (StandardScaler): copy=True, with_mean=True, with_std=True
polynomial (PolynomialFeatures): degree=3, interaction_only=False, include_bias=True, order='C'
classifier (LogisticRegression): penalty='l2', dual=False, tol=0.0001, C=2.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='lbfgs', max_iter=100, multi_class='deprecated', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None

Displaying a Pipeline and Dimensionality Reduction and Classifier#

This section constructs a Pipeline with a dimensionality reduction step, PCA, and a classifier, SVC, and displays its visual representation.

from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

steps = [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('reduce_dim', PCA(n_components=4)),
                ('classifier', SVC(kernel='linear'))])
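
Outside a notebook, the same diagram can be obtained as an HTML string. The sketch below uses estimator_html_repr from sklearn.utils and writes the result to a file. The file name and the temporary directory are hypothetical choices made for this illustration.

```python
import tempfile
from pathlib import Path

from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.utils import estimator_html_repr

pipe = Pipeline(
    [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
)

# Build the HTML markup of the diagram; the estimator need not be fitted.
html = estimator_html_repr(pipe)

# Hypothetical output location: a fresh temporary directory.
out_path = Path(tempfile.mkdtemp()) / "pipeline_diagram.html"
out_path.write_text(html, encoding="utf-8")
```

Opening the resulting file in a browser should show the same clickable diagram that a notebook renders.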

Parameters shown when each element of the diagram is expanded:

Pipeline: steps=[('reduce_dim', ...), ('classifier', ...)], transform_input=None, memory=None, verbose=False
reduce_dim (PCA): n_components=4, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', n_oversamples=10, power_iteration_normalizer='auto', random_state=None
classifier (SVC): C=1.0, kernel='linear', degree=3, gamma='scale', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', break_ties=False, random_state=None

Displaying a Complex Pipeline Chaining a Column Transformer#

This section constructs a complex Pipeline with a ColumnTransformer and a classifier, LogisticRegression, and displays its visual representation.

import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)

categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)

pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('columntransformer',
                 ColumnTransformer(transformers=[('categorical',
                                                  Pipeline(steps=[('imputation_constant',
                                                                   SimpleImputer(fill_value='missing',
                                                                                 strategy='constant')),
                                                                  ('onehot',
                                                                   OneHotEncoder(handle_unknown='ignore'))]),
                                                  ['state', 'gender']),
                                                 ('numerical',
                                                  Pipeline(steps=[('imputation_mean',
                                                                   SimpleImputer()),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['age', 'weight'])])),
                ('logisticregression', LogisticRegression(max_iter=500))])
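
The repr above shows that make_pipeline names its steps automatically, using the lowercased class names. The sketch below rebuilds the same pipeline and inspects those names, together with the double-underscore paths that reach nested parameters. The variables step_names, params, max_iter and has_nested are introduced here for illustration only.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    [("imputation_mean", SimpleImputer(strategy="mean")), ("scaler", StandardScaler())]
)
categorical_preprocessor = Pipeline(
    [
        ("imputation_constant", SimpleImputer(fill_value="missing", strategy="constant")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))

# Step names generated by make_pipeline.
step_names = list(pipe.named_steps)

# Nested parameters are addressed as <step>__<sub-step>__<parameter>.
params = pipe.get_params()
max_iter = params["logisticregression__max_iter"]
has_nested = "columntransformer__numerical__scaler__with_mean" in params

# set_params accepts the same paths.
pipe.set_params(columntransformer__numerical__scaler__with_mean=False)
```

The same naming scheme is what a parameter grid uses to target a step inside a pipeline.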

Parameters shown when each element of the diagram is expanded:

columntransformer (ColumnTransformer): transformers=[('categorical', ...), ('numerical', ...)], remainder='drop', sparse_threshold=0.3, n_jobs=None, transformer_weights=None, verbose=False, verbose_feature_names_out=True, force_int_remainder_cols='deprecated'
imputation_constant (SimpleImputer): missing_values=nan, strategy='constant', fill_value='missing', copy=True, add_indicator=False, keep_empty_features=False
onehot (OneHotEncoder): categories='auto', drop=None, sparse_output=True, dtype=<class 'numpy.float64'>, handle_unknown='ignore', min_frequency=None, max_categories=None, feature_name_combiner='concat'
imputation_mean (SimpleImputer): missing_values=nan, strategy='mean', fill_value=None, copy=True, add_indicator=False, keep_empty_features=False
scaler (StandardScaler): copy=True, with_mean=True, with_std=True
logisticregression (LogisticRegression): penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='lbfgs', max_iter=500, multi_class='deprecated', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None

Displaying a Grid Search over a Pipeline with a Classifier#

This section constructs a GridSearchCV over a Pipeline with RandomForestClassifier and displays its visual representation.

import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)

categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)

pipe = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", RandomForestClassifier())]
)

param_grid = {
    "classifier__n_estimators": [200, 500],
    "classifier__max_features": ["auto", "sqrt", "log2"],
    "classifier__max_depth": [4, 5, 6, 7, 8],
    "classifier__criterion": ["gini", "entropy"],
}

grid_search = GridSearchCV(pipe, param_grid=param_grid, n_jobs=1)
grid_search  # click on the diagram below to see the details of each step

GridSearchCV(estimator=Pipeline(steps=[('preprocessor',
                                        ColumnTransformer(transformers=[('categorical',
                                                                         Pipeline(steps=[('imputation_constant',
                                                                                          SimpleImputer(fill_value='missing',
                                                                                                        strategy='constant')),
                                                                                         ('onehot',
                                                                                          OneHotEncoder(handle_unknown='ignore'))]),
                                                                         ['state',
                                                                          'gender']),
                                                                        ('numerical',
                                                                         Pipeline(steps=[('imputation_mean',
                                                                                          SimpleImputer()),
                                                                                         ('scaler',
                                                                                          StandardScaler())]),
                                                                         ['age',
                                                                          'weight'])])),
                                       ('classifier',
                                        RandomForestClassifier())]),
             n_jobs=1,
             param_grid={'classifier__criterion': ['gini', 'entropy'],
                         'classifier__max_depth': [4, 5, 6, 7, 8],
                         'classifier__max_features': ['auto', 'sqrt', 'log2'],
                         'classifier__n_estimators': [200, 500]})

Parameters shown when each element of the diagram is expanded:

GridSearchCV: estimator=Pipeline(step...lassifier())]), param_grid={'classifier__criterion': ['gini', 'entropy'], 'classifier__max_depth': [4, 5, ...], 'classifier__max_features': ['auto', 'sqrt', ...], 'classifier__n_estimators': [200, 500]}, scoring=None, n_jobs=1, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', error_score=nan, return_train_score=False
preprocessor (ColumnTransformer): transformers=[('categorical', ...), ('numerical', ...)], remainder='drop', sparse_threshold=0.3, n_jobs=None, transformer_weights=None, verbose=False, verbose_feature_names_out=True, force_int_remainder_cols='deprecated'
imputation_constant (SimpleImputer): missing_values=nan, strategy='constant', fill_value='missing', copy=True, add_indicator=False, keep_empty_features=False
onehot (OneHotEncoder): categories='auto', drop=None, sparse_output=True, dtype=<class 'numpy.float64'>, handle_unknown='ignore', min_frequency=None, max_categories=None, feature_name_combiner='concat'
imputation_mean (SimpleImputer): missing_values=nan, strategy='mean', fill_value=None, copy=True, add_indicator=False, keep_empty_features=False
scaler (StandardScaler): copy=True, with_mean=True, with_std=True
classifier (RandomForestClassifier): n_estimators=100, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='sqrt', max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None, ccp_alpha=0.0, max_samples=None, monotonic_cst=None
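
Each key of param_grid has the form <step name>__<parameter>, so 'classifier__max_depth' targets the max_depth parameter of the step named 'classifier'. To get a feel for the size of the search before fitting anything, the grid can be expanded with ParameterGrid. This is a minimal sketch; candidates, n_candidates and first are names introduced here.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the example above.
param_grid = {
    "classifier__n_estimators": [200, 500],
    "classifier__max_features": ["auto", "sqrt", "log2"],
    "classifier__max_depth": [4, 5, 6, 7, 8],
    "classifier__criterion": ["gini", "entropy"],
}

candidates = ParameterGrid(param_grid)

# One candidate per combination of values: 2 * 3 * 5 * 2.
n_candidates = len(candidates)

# Each candidate is a dict that maps every grid key to one value.
first = candidates[0]
```

With cv=None, GridSearchCV uses 5-fold cross-validation, so fitting this grid would train each candidate five times, plus one final refit. Recent scikit-learn releases no longer accept 'auto' as a max_features value for RandomForestClassifier, so the candidates using it would likely fail at fit time. Displaying the unfitted search object is unaffected.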

Total running time of the script: (0 minutes 0.124 seconds)
