Create a categorical DataFrame
from a DataFrame
of dummy variables.
Inverts the operation performed by get_dummies()
.
Added in version 1.5.0.
Data which contains dummy-coded variables in form of integer columns of 1âs and 0âs.
Separator used in the column names of the dummy categories they are character indicating the separation of the categorical names from the prefixes. For example, if your column names are âprefix_Aâ and âprefix_Bâ, you can strip the underscore by specifying sep=â_â.
The default category is the implied category when a value has none of the listed categories specified with a one, i.e. if all dummies in a row are zero. Can be a single value for all variables or a dict directly mapping the default categories to a prefix of a variable.
Categorical data decoded from the dummy input-data.
When the input DataFrame
data
contains NA values.
When the input DataFrame
data
contains column names with separators that do not match the separator specified with sep
.
When a dict
passed to default_category
does not include an implied category for each prefix.
When a value in data
has more than one category assigned to it.
When default_category=None
and a value in data
has no category assigned to it.
When the input data
is not of type DataFrame
.
When the input DataFrame
data
contains non-dummy data.
When the passed sep
is of a wrong data type.
When the passed default_category
is of a wrong data type.
See also
get_dummies()
Convert Series
or DataFrame
to dummy codes.
Categorical
Represent a categorical variable in classic.
Notes
The columns of the passed dummy data should only include 1âs and 0âs, or boolean values.
Examples
>>> df = pd.DataFrame({"a": [1, 0, 0, 1], "b": [0, 1, 0, 0], "c": [0, 0, 1, 0]})
>>> df a b c 0 1 0 0 1 0 1 0 2 0 0 1 3 1 0 0
>>> pd.from_dummies(df) 0 a 1 b 2 c 3 a
>>> df = pd.DataFrame( ... { ... "col1_a": [1, 0, 1], ... "col1_b": [0, 1, 0], ... "col2_a": [0, 1, 0], ... "col2_b": [1, 0, 0], ... "col2_c": [0, 0, 1], ... } ... )
>>> df col1_a col1_b col2_a col2_b col2_c 0 1 0 0 1 0 1 0 1 1 0 0 2 1 0 0 0 1
>>> pd.from_dummies(df, sep="_") col1 col2 0 a b 1 b a 2 a c
>>> df = pd.DataFrame( ... { ... "col1_a": [1, 0, 0], ... "col1_b": [0, 1, 0], ... "col2_a": [0, 1, 0], ... "col2_b": [1, 0, 0], ... "col2_c": [0, 0, 0], ... } ... )
>>> df col1_a col1_b col2_a col2_b col2_c 0 1 0 0 1 0 1 0 1 1 0 0 2 0 0 0 0 0
>>> pd.from_dummies(df, sep="_", default_category={"col1": "d", "col2": "e"}) col1 col2 0 a b 1 b a 2 d e
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4