Showing content from http://contrib.scikit-learn.org/category_encoders/ below:

Category Encoders — Category Encoders 2.8.1 documentation

Category Encoders

A set of scikit-learn-style transformers for encoding categorical variables as numeric values using different techniques. While the ordinal, one-hot, and hashing encoders have equivalents in scikit-learn itself, the transformers in this library all share a few useful properties:

(*) For full compatibility with Pipelines and ColumnTransformers, and for consistent behaviour of get_feature_names_out, it is recommended to upgrade sklearn to version 1.2.0 or later and to set the output to pandas:

import sklearn
sklearn.set_config(transform_output="pandas")

Usage

Install with:

pip install category_encoders

or

conda install -c conda-forge category_encoders

To use:

import category_encoders as ce

encoder = ce.BackwardDifferenceEncoder(cols=[...])
encoder = ce.BaseNEncoder(cols=[...])
encoder = ce.BinaryEncoder(cols=[...])
encoder = ce.CatBoostEncoder(cols=[...])
encoder = ce.CountEncoder(cols=[...])
encoder = ce.GLMMEncoder(cols=[...])
encoder = ce.GrayEncoder(cols=[...])
encoder = ce.HashingEncoder(cols=[...])
encoder = ce.HelmertEncoder(cols=[...])
encoder = ce.JamesSteinEncoder(cols=[...])
encoder = ce.LeaveOneOutEncoder(cols=[...])
encoder = ce.MEstimateEncoder(cols=[...])
encoder = ce.OneHotEncoder(cols=[...])
encoder = ce.OrdinalEncoder(cols=[...])
encoder = ce.PolynomialEncoder(cols=[...])
encoder = ce.QuantileEncoder(cols=[...])
encoder = ce.RankHotEncoder(cols=[...])
encoder = ce.SumEncoder(cols=[...])
encoder = ce.TargetEncoder(cols=[...])
encoder = ce.WOEEncoder(cols=[...])

encoder.fit(X, y)
X_cleaned = encoder.transform(X_dirty)

All of these are fully compatible sklearn transformers, so they can be used in pipelines or in your existing scripts. If the cols parameter isn’t passed, every non-numeric column will be converted. See below for detailed documentation.

Known issues:

CategoryEncoders works internally with pandas DataFrames, whereas sklearn works with numpy arrays. This can cause problems in sklearn versions prior to 1.2.0. To ensure full compatibility, configure sklearn to output DataFrames as well. This can be done with

sklearn.set_config(transform_output="pandas")

for a whole project or just for a single pipeline using

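from sklearn.pipeline import Pipeline

# SomePreprocessor and SomeEncoder are placeholders for your own transformers.
Pipeline(
    steps=[
        ("preprocessor", SomePreprocessor().set_output(transform="pandas")),
        ("encoder", SomeEncoder()),
    ]
)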

If you experience another bug, feel free to report it on [GitHub](https://github.com/scikit-learn-contrib/category_encoders/issues).
