2022-11-08
Upcoming feature in release 1.2
Starting with the next release of scikit-learn (v1.2), pandas DataFrame output will be available for all sklearn transformers! This will make running pipelines on DataFrames much easier and provide better ways to track feature names. Previously, mapping transformed output back to columns was cumbersome, because complex preprocessing (e.g., polynomial features) does not always produce a one-to-one mapping between input and output columns.
The pandas DataFrame output feature for transformers solves this by automatically tracking the features generated by pipelines. The transformer output format can be configured explicitly as either NumPy or pandas, as shown in sklearn.set_config and the sample code below.
from sklearn import set_config
set_config(transform_output="pandas")
See the sample notebook, pandas-dataframe-output-for-sklearn-transformer.ipynb, and the documentation for a more detailed example and usage notes.
Reporting bugs
We’d love your feedback on this. If you have suggestions or find bugs, please report them on the scikit-learn issue tracker.
Thanks 🙏🏾 to the maintainers: Thomas J. Fan, Guillaume Lemaitre, Christian Lorentzen!