I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
In [1]: import pandas as pd

In [2]: df = pd.DataFrame(columns=['A', 'B', 'C'])

In [3]: df.groupby('A').B.describe()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-7259c86156be> in <module>
----> 1 df.groupby('A').B.describe()

~/.../python3.8/site-packages/pandas/core/groupby/generic.py in describe(self, **kwargs)
    674         if self.axis == 1:
    675             return result.T
--> 676         return result.unstack()
    677
    678     def value_counts(

~/.../python3.8/site-packages/pandas/core/series.py in unstack(self, level, fill_value)
   3827         from pandas.core.reshape.reshape import unstack
   3828
-> 3829         return unstack(self, level, fill_value)
   3830
   3831     # ----------------------------------------------------------------------

~/.../python3.8/site-packages/pandas/core/reshape/reshape.py in unstack(obj, level, fill_value)
    422         # Give nicer error messages when unstack a Series whose
    423         # Index is not a MultiIndex.
--> 424         raise ValueError(
    425             f"index must be a MultiIndex to unstack, {type(obj.index)} was passed"
    426         )

ValueError: index must be a MultiIndex to unstack, <class 'pandas.core.indexes.base.Index'> was passed

In [4]: df.groupby('A').describe()
Out[4]: Series([], dtype: float64)

Problem description
SeriesGroupBy.describe raises a ValueError when called on an empty dataset, whereas DataFrameGroupBy.describe succeeds but returns an empty Series.
I would expect both of these to return an empty DataFrame with the appropriate columns.
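Until the behavior changes, one possible workaround is to guard the empty case by hand. The following is only a sketch: `describe_group_column` and `DESCRIBE_COLUMNS` are hypothetical names, not part of pandas, and the column list assumes the default statistics that `describe` reports for numeric data.

```python
import pandas as pd

# Assumption: the default statistics describe() reports for numeric data.
DESCRIBE_COLUMNS = ["count", "mean", "std", "min", "25%", "50%", "75%", "max"]


def describe_group_column(df, by, column):
    """Hypothetical helper: per-group describe() that tolerates empty input."""
    if df.empty:
        # Skip groupby entirely and return the empty frame this report expects.
        return pd.DataFrame(columns=DESCRIBE_COLUMNS, dtype="float64")
    return df.groupby(by)[column].describe()


# Empty input: an empty DataFrame with the describe() columns, no exception.
empty = pd.DataFrame(columns=["A", "B", "C"])
empty_result = describe_group_column(empty, "A", "B")

# Non-empty input: the helper defers to the regular groupby describe().
filled = pd.DataFrame({"A": ["x", "x", "y"], "B": [1.0, 2.0, 3.0]})
filled_result = describe_group_column(filled, "A", "B")
```

This only sidesteps the SeriesGroupBy case; the DataFrameGroupBy case would additionally need a MultiIndex of (column, statistic) pairs, which this sketch does not attempt.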
In [3]: df.groupby('A').B.describe()
Out[3]:
Empty DataFrame
Columns: [count, mean, std, min, 25%, 50%, 75%, max]
Index: []

In [4]: df.groupby('A').describe()
Out[4]:
Empty DataFrame
Columns: [(B, count), (B, mean), (B, std), (B, min), (B, 25%), (B, 50%), (B, 75%), (B, max), (C, count), (C, mean), (C, std), (C, min), (C, 25%), (C, 50%), (C, 75%), (C, max)]
Index: []

Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit : 2cb9652
python : 3.8.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.26-1rodete1-amd64
Version : #1 SMP Debian 5.10.26-1rodete1 (2021-04-12)
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.2.4
numpy : 1.19.5
pytz : 2019.3
dateutil : 2.8.1
pip : 20.2.1
setuptools : 49.2.1
Cython : 0.29.13
pytest : 4.6.11
hypothesis : None
sphinx : 1.8.5
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 3.0.0
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.20
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None