When exploring a data set, I often need to run df.apply(pd.Series.nunique) or df.apply(lambda x: x.nunique()). How about adding this as a nunique() method parallel to DataFrame.count()? (count and unique are also the two most basic pieces of information displayed by DataFrame.describe().)
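For illustration, here is a minimal sketch of the two workarounds next to count(). The frame, its column names, and its values are made up for this example.

```python
import pandas as pd

# Hypothetical example data; None marks a missing value.
df = pd.DataFrame({'a': [1, 1, 2, None], 'b': ['x', 'y', 'y', 'y']})

# Both spellings return a Series of distinct-value counts, indexed by column name.
via_method = df.apply(pd.Series.nunique)
via_lambda = df.apply(lambda x: x.nunique())

# count() returns the number of non-missing values per column, for comparison.
non_null = df.count()

print(via_method.to_dict())  # {'a': 2, 'b': 2}
print(non_null.to_dict())    # {'a': 3, 'b': 4}
```

Note that Series.nunique ignores missing values by default, which is why column a reports 2 rather than 3.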
I think there are also use cases for this as a groupby method, for example when checking whether a candidate primary key maps to differing rows (values):
>>> import pandas as pd
>>> df = pd.DataFrame({'id': ['spam', 'eggs', 'eggs', 'spam'], 'value': [1, 5, 5, 2]})
>>> df.groupby('id').filter(lambda g: (g.apply(pd.Series.nunique) > 1).any())
     id  value
0  spam      1
3  spam      2
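A narrower variant of this check, as a sketch: if only one column needs to be tested against the candidate key, the per-group distinct count can be computed with the existing nunique() on a grouped Series and used to select the offending keys. The variable names here are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'id': ['spam', 'eggs', 'eggs', 'spam'],
                   'value': [1, 5, 5, 2]})

# Number of distinct 'value' entries for each id.
distinct_per_id = df.groupby('id')['value'].nunique()

# ids that map to more than one distinct value, so 'id' is not a valid key.
ambiguous_ids = distinct_per_id[distinct_per_id > 1].index

# Rows belonging to those ids; this selects the same rows as the filter() call above.
conflicts = df[df['id'].isin(ambiguous_ids)]
print(conflicts)
```

This only covers a single column; the filter() version above checks every column of each group at once, which is where a groupby-level nunique() would help.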