python -c "import pandas as pd; df = pd.DataFrame(dict(a=[1, 1, 2, 3], b=[1, 2, 3, 4])); print(df.nsmallest(2, 'a'))"
Problem description
When using nlargest/nsmallest and the n largest or smallest values are identical, the method appears to return the dataframe concatenated with the filtered version of itself.
Furthermore, if all values are identical, you get the full dataframe concatenated with itself, regardless of the choice of n.
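The symptom described above can be checked for mechanically. The following is a minimal sketch, assuming the duplication shows up as extra rows or repeated index labels; `looks_duplicated` is a hypothetical helper, not part of pandas, and the `symptom` frame is built by hand to mimic the report rather than taken from actual output.

```python
import pandas as pd


def looks_duplicated(result, n):
    """Hypothetical check: True if `result` has more than n rows
    or contains a repeated index label."""
    return len(result) > n or not result.index.is_unique


df = pd.DataFrame(dict(a=[1, 1, 2, 3], b=[1, 2, 3, 4]))

# Hand-built frame mimicking the reported symptom:
# the full frame followed by its filtered rows.
symptom = pd.concat([df, df.head(2)])
clean = df.head(2)

print(looks_duplicated(symptom, 2))
print(looks_duplicated(clean, 2))

# The result of this call depends on the installed pandas version.
print(looks_duplicated(df.nsmallest(2, "a"), 2))
```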
I am not entirely sure what the expected output is, but I would expect the example above to return a dataframe that looks like this:
pd.DataFrame(dict(a=[1, 1], b=[1, 2]))
Similarly, if you had
df = pd.DataFrame(dict(a=[1, 1, 1, 1], b=[1, 2, 3, 4]))
and asked for
df.nlargest(2, 'a')
you should again get
pd.DataFrame(dict(a=[1, 1], b=[1, 2]))
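Until the behaviour is fixed, one possible workaround is to emulate the selection with a stable sort followed by `head(n)`. This is only a sketch of the expected semantics, not how pandas implements nlargest/nsmallest; the helper names `nsmallest_by_sort` and `nlargest_by_sort` are hypothetical.

```python
import pandas as pd


def nsmallest_by_sort(frame, n, column):
    # A stable sort (mergesort) keeps tied rows in their original order,
    # so head(n) returns exactly n rows.
    return frame.sort_values(column, kind="mergesort").head(n)


def nlargest_by_sort(frame, n, column):
    # Same idea, sorting in descending order.
    return frame.sort_values(column, ascending=False, kind="mergesort").head(n)


df = pd.DataFrame(dict(a=[1, 1, 2, 3], b=[1, 2, 3, 4]))
ties = pd.DataFrame(dict(a=[1, 1, 1, 1], b=[1, 2, 3, 4]))

small = nsmallest_by_sort(df, 2, "a")
large = nlargest_by_sort(ties, 2, "a")

print(small)
print(large)
```

Sorting the whole frame is slower than a dedicated partial selection on large data, so this is best treated as a stopgap.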
pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-34-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_IE.UTF-8
LOCALE: None.None
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 28.3.0
Cython: 0.23.4
numpy: 1.12.0
scipy: 0.16.1
statsmodels: 0.6.1
xarray: None
IPython: None
sphinx: 1.3.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: 3.2.0
numexpr: 2.4.6
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: 0.9.2
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.38.0
pandas_datareader: None