Showing content from https://github.com/pandas-dev/pandas/issues/5391 below:

left join fails in case of non-unique indices · Issue #5391 · pandas-dev/pandas · GitHub

It seems to me that the join operation fails if the index values are not unique. The particular circumstance in which I observed this was with a multi-index:

df1.set_index(['col1', 'col2', 'col3'], inplace=True)
df2.join(df1, on=['cola', 'colb', 'colc'], how='left')

I understand that the above join operation is not well-defined for non-unique index values, but pandas gives wrong values even for keys that match uniquely (with no warnings or error messages whatsoever).

If checking index integrity has a heavy performance cost, it should at least be documented that this method fails when the index is not unique (or, alternatively, there could be an optional argument to enforce an integrity check).
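As a rough illustration of the kind of integrity check suggested above, the sketch below wraps the join in a helper that refuses to run when the right-hand index contains duplicate keys. The helper name `checked_left_join` and the sample data are hypothetical and not part of pandas or of the original report; `Index.is_unique` and `Index.duplicated()` are existing pandas attributes.

```python
import pandas as pd


def checked_left_join(left, right, on):
    """Left-join `right` (indexed by its key columns) onto `left`.

    Hypothetical helper: raises instead of joining when the index of
    `right` contains duplicate keys.
    """
    if not right.index.is_unique:
        # Collect each duplicated key once for the error message.
        dupes = right.index[right.index.duplicated()].unique().tolist()
        raise ValueError(f"right index has duplicate keys: {dupes}")
    return left.join(right, on=on, how="left")


# Made-up data: the key (1, 'a', 10) appears twice in df1.
df1 = pd.DataFrame({
    "col1": [1, 1, 2],
    "col2": ["a", "a", "b"],
    "col3": [10, 10, 20],
    "value": [100, 101, 200],
}).set_index(["col1", "col2", "col3"])

df2 = pd.DataFrame({"cola": [1, 2], "colb": ["a", "b"], "colc": [10, 20]})

try:
    checked_left_join(df2, df1, on=["cola", "colb", "colc"])
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)

# With the duplicated key removed, the same call goes through.
unique_df1 = df1[~df1.index.duplicated()]
result = checked_left_join(df2, unique_df1, on=["cola", "colb", "colc"])
```

The check costs one pass over the index, so it could reasonably be made optional where that matters.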

I could get a correct join by doing the following:

df1.drop_duplicates(cols=['col1', 'col2', 'col3'], inplace=True)
df1.set_index(['col1', 'col2', 'col3'], inplace=True)
df2.join(df1, on=['cola', 'colb', 'colc'], how='left')
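For readers on a recent pandas release, the same workaround might look like the sketch below. It assumes a version where `drop_duplicates` takes `subset=` (the `cols=` keyword used above comes from older releases) and where `DataFrame.merge` accepts `validate=`. The sample data is made up, and `merge` is swapped in for `join` only to show the built-in uniqueness check.

```python
import pandas as pd

# Made-up data: the key (1, 'a', 10) appears twice in df1.
df1 = pd.DataFrame({
    "col1": [1, 1, 2],
    "col2": ["a", "a", "b"],
    "col3": [10, 10, 20],
    "value": [100, 100, 200],
})
df2 = pd.DataFrame({"cola": [2, 1], "colb": ["b", "a"], "colc": [20, 10]})

keys_left = ["cola", "colb", "colc"]
keys_right = ["col1", "col2", "col3"]

# Without de-duplication, validate="many_to_one" rejects the merge
# because the right-hand keys are not unique.
try:
    df2.merge(df1, left_on=keys_left, right_on=keys_right,
              how="left", validate="many_to_one")
    rejected = False
except pd.errors.MergeError:
    rejected = True

# Keep the first row for each key, then merge with the check enabled.
deduped = df1.drop_duplicates(subset=keys_right)
merged = df2.merge(deduped, left_on=keys_left, right_on=keys_right,
                   how="left", validate="many_to_one")
```

Note that `drop_duplicates` silently keeps the first row per key, so this is only appropriate when the duplicated rows are genuinely redundant.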
