import numpy as np
import pandas as pd

res = []
for _ in range(2):
    res1 = []
    # Only occurs when dataframe is used with measure
    data = np.zeros((30, 21))
    idx = np.random.randint(0, 5, 30)
    df = pd.DataFrame(data, index=idx).loc[3]
    #df = pd.DataFrame(data[::5, :])  # Uncomment for example of correct behavior
    res1.append(pd.DataFrame(sum(data.dot(df.T))))
    tmp = pd.concat(res1, keys=[1], names=['level1'])
    res.append(tmp)
final = pd.concat(res, keys=[i for i in range(2)], names=['level2'])
print(final)

Problem description
In Python, datatypes generally don't matter: a dataframe is a dataframe. But as the example code shows, concat'ing dataframes built with an index does not behave the same as concat'ing dataframes built without one. The label for one level of the index is dropped. This is a small bug. Run it several times (10-12 seems to do it) and you will see a much more worrisome issue: on occasion, the label is not dropped. Yes, the output of concat is random.
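A deterministic variant of the sample may make the intermittent behavior easier to pin down. This is only a sketch, under the assumption that the randomness comes from how many rows happen to carry index label 3 on each pass, which determines the length of each frame passed to concat. The helper `nested_concat` and its `row_counts` parameter are hypothetical names introduced here, and `np.zeros(n_rows)` stands in for `sum(data.dot(df.T))`.

```python
import numpy as np
import pandas as pd


def nested_concat(row_counts):
    """Repeat the two-level concat from the sample with a fixed row count per pass."""
    res = []
    for n_rows in row_counts:
        # Stand-in for pd.DataFrame(sum(data.dot(df.T))): one zero per selected row
        frame = pd.DataFrame(np.zeros(n_rows))
        res.append(pd.concat([frame], keys=[1], names=['level1']))
    return pd.concat(res, keys=list(range(len(row_counts))), names=['level2'])


same = nested_concat([6, 6])       # both passes yield identical inner indexes
different = nested_concat([6, 4])  # inner indexes differ in length

print(list(same.index.names))
# Whether 'level1' survives here may depend on the pandas version in use
print(list(different.index.names))
```

If the assumption holds, comparing the two printed name lists shows whether the label is lost only when the inner frames differ in length.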
Expected Output
level2 level1
0      1      0  0.0
              1  0.0
              2  0.0
              3  0.0
              4  0.0
              5  0.0
1      1      0  0.0
              1  0.0
              2  0.0
              3  0.0
              4  0.0
              5  0.0
Output of pd.show_versions()
level2
0      1 0  0.0
         1  0.0
         2  0.0
         3  0.0
         4  0.0
         5  0.0
1      1 0  0.0
         1  0.0
         2  0.0
         3  0.0