Convert categorical variable into dummy/indicator variables.
Each variable is converted in as many 0/1 variables as there are different values. Columns in the output are each named after a value; if the input is a DataFrame, the name of the original variable is prepended to the value.
Data of which to get dummy indicators.
A string to be prepended to DataFrame column names. Pass a list with length equal to the number of columns when calling get_dummies on a DataFrame. Alternatively, prefix can be a dictionary mapping column names to prefixes.
Should you choose to prepend DataFrame column names with a prefix, this is the separator/delimiter to use between the two. Alternatively, prefix_sep can be a list with length equal to the number of columns, or a dictionary mapping column names to separators.
If True, a NaN indicator column will be added even if no NaN values are present. If False, NA values are encoded as all zero.
Column names in the DataFrame to be encoded. If columns is None then all the columns with object, string, or category dtype will be converted.
Whether the dummy-encoded columns should be backed by a SparseArray
(True) or a regular NumPy array (False).
Whether to get k-1 dummies out of k categorical levels by removing the first level.
Data type for new columns. Only a single dtype is allowed.
Dummy-coded data. If data contains other columns than the dummy-coded one(s), these will be prepended, unaltered, to the result.
Notes
Reference the user guide for more examples.
Examples
>>> s = pd.Series(list("abca"))
>>> pd.get_dummies(s) a b c 0 True False False 1 False True False 2 False False True 3 True False False
>>> s1 = ["a", "b", np.nan]
>>> pd.get_dummies(s1) a b 0 True False 1 False True 2 False False
>>> pd.get_dummies(s1, dummy_na=True) a b NaN 0 True False False 1 False True False 2 False False True
>>> df = pd.DataFrame({"A": ["a", "b", "a"], "B": ["b", "a", "c"], "C": [1, 2, 3]})
>>> pd.get_dummies(df, prefix=["col1", "col2"]) C col1_a col1_b col2_a col2_b col2_c 0 1 True False False True False 1 2 False True True False False 2 3 True False False False True
>>> pd.get_dummies(pd.Series(list("abcaa"))) a b c 0 True False False 1 False True False 2 False False True 3 True False False 4 True False False
>>> pd.get_dummies(pd.Series(list("abcaa")), drop_first=True) b c 0 False False 1 True False 2 False True 3 False False 4 False False
>>> pd.get_dummies(pd.Series(list("abc")), dtype=float) a b c 0 1.0 0.0 0.0 1 0.0 1.0 0.0 2 0.0 0.0 1.0
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4