Showing content from https://github.com/pandas-dev/pandas/issues/11600 below:

sparse dataframes lose multi-index column names · Issue #11600 · pandas-dev/pandas · GitHub

From SO: http://stackoverflow.com/questions/33702198/do-python-pandas-sparse-dataframes-lose-multi-index-column-names-or-am-i-doing-i

The bug is simple in concept: a DataFrame whose columns are a multi-index with level names loses those names when it is converted to a sparse dataframe.

Minimal example - first create a multi-index dataframe:

In[2]: import pandas as pd
In[3]: miindex = pd.MultiIndex.from_product([["x","y"], ["10","20"]],names=['row-foo', 'row-bar'])
micol = pd.MultiIndex.from_product([['a','b','c'], ["1","2"]],names=['col-foo', 'col-bar'])
df = pd.DataFrame(index=miindex, columns=micol).sortlevel().sortlevel(axis=1)
df = df.fillna(value=3.14)
df
Out[3]: 
col-foo             a           b           c      
col-bar             1     2     1     2     1     2
row-foo row-bar                                    
x       10       3.14  3.14  3.14  3.14  3.14  3.14
        20       3.14  3.14  3.14  3.14  3.14  3.14
y       10       3.14  3.14  3.14  3.14  3.14  3.14
        20       3.14  3.14  3.14  3.14  3.14  3.14
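For anyone reproducing this on a newer pandas: as far as I know `DataFrame.sortlevel()` was removed in pandas 1.0, so the snippet above will not run there as written. Here is a minimal sketch of the same test frame, assuming `sort_index()` as the replacement, that checks the level names directly instead of reading them off the printed output. The variable names are mine, not from the original report.

```python
import pandas as pd

# Same row and column multi-indexes as in the report above.
miindex = pd.MultiIndex.from_product(
    [["x", "y"], ["10", "20"]], names=["row-foo", "row-bar"]
)
micol = pd.MultiIndex.from_product(
    [["a", "b", "c"], ["1", "2"]], names=["col-foo", "col-bar"]
)

# Assumption: sort_index() stands in for the older sortlevel() call.
df = pd.DataFrame(3.14, index=miindex, columns=micol)
df = df.sort_index().sort_index(axis=1)

# Inspect the level names programmatically rather than by eye.
row_names = list(df.index.names)
col_names = list(df.columns.names)
print(row_names, col_names)
```

Comparing `list(df.columns.names)` before and after a conversion is a quicker way to spot the loss than comparing the rendered tables.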

This gives us a nice test dataframe with level names on both the rows and the columns. Now if I make a sparse dataframe out of it and display it, the column level names are gone.

In[4]: ds = df.to_sparse()
ds
Out[4]: 
                    a           b           c      
                    1     2     1     2     1     2
row-foo row-bar                                    
x       10       3.14  3.14  3.14  3.14  3.14  3.14
        20       3.14  3.14  3.14  3.14  3.14  3.14
y       10       3.14  3.14  3.14  3.14  3.14  3.14
        20       3.14  3.14  3.14  3.14  3.14  3.14

And if I convert the sparse version back to dense, those level names are still gone.

In[6]: ds.to_dense()
Out[6]: 
                    a           b           c      
                    1     2     1     2     1     2
row-foo row-bar                                    
x       10       3.14  3.14  3.14  3.14  3.14  3.14
        20       3.14  3.14  3.14  3.14  3.14  3.14
y       10       3.14  3.14  3.14  3.14  3.14  3.14
        20       3.14  3.14  3.14  3.14  3.14  3.14

I am aware that displaying the sparse version calls to_dense(), but the loss appears to happen at the conversion to sparse. I'm exploring a move to sparse to reduce memory usage for a code base, and my attempts to access the levels by name within the sparse dataframe raise "KeyError: 'Level not found'".
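A possible stopgap until this is fixed is to copy the level names back from the dense frame after converting. The sketch below is only that, a sketch under assumptions: `restore_level_names` is a hypothetical helper of mine, not a pandas function, and because `to_sparse()` no longer exists in recent pandas the conversion uses `astype(pd.SparseDtype(...))` instead. The loss of names is simulated explicitly so the helper has something to repair; I have not verified this against the pandas version in the report.

```python
import pandas as pd


def restore_level_names(converted, original):
    """Return a copy of `converted` with the row and column level
    names taken from `original`. Hypothetical helper, not pandas API."""
    fixed = converted.copy()
    # set_names returns a new index, so neither input is mutated.
    fixed.index = fixed.index.set_names(list(original.index.names))
    fixed.columns = fixed.columns.set_names(list(original.columns.names))
    return fixed


miindex = pd.MultiIndex.from_product(
    [["x", "y"], ["10", "20"]], names=["row-foo", "row-bar"]
)
micol = pd.MultiIndex.from_product(
    [["a", "b", "c"], ["1", "2"]], names=["col-foo", "col-bar"]
)
df = pd.DataFrame(3.14, index=miindex, columns=micol)

# Assumption: sparse conversion via SparseDtype on a recent pandas.
ds = df.astype(pd.SparseDtype("float64", fill_value=3.14))

# Simulate the reported symptom by dropping the column level names.
stripped = ds.copy()
stripped.columns = stripped.columns.set_names([None, None])

fixed = restore_level_names(stripped, df)

# Level lookups by name work again on the repaired frame.
top_level = list(fixed.columns.get_level_values("col-foo").unique())
print(list(fixed.columns.names), top_level)
```

This only papers over the symptom: any code that converts to sparse would need to call the helper afterwards, so it is no substitute for the conversion preserving the names itself.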

