Showing content from https://github.com/modin-project/modin/issues/6875 below:

`groupby(sort=True)` may produce unsorted results with range-partitioning implementation · Issue #6875 · modin-project/modin · GitHub

Modin version checks

Reproducible Example
import numpy as np
import modin.pandas as pd
import pandas

from modin.pandas.test.utils import df_equals

import modin.config as cfg
cfg.IsDebug.put(True)
cfg.NPartitions.put(4)
cfg.RangePartitioningGroupby.put(True)

np.random.seed(214)

data = {
    "a": ["a", "b", "c", "d", "e", "b", "g", "a"] * 32,
    "b": [1, 2, 3, 4] * 64,
    "c": range(256),
    "d": range(256),
    "e": ["x", "y"] * 128,
}

filter = lambda row: (~row["a"].isin(["a", "e"]) & ~row["b"].isin([4]))

md_df, pd_df = pd.DataFrame(data), pandas.DataFrame(data)
md_df = md_df[filter]
pd_df = pd_df[filter]

md_res = md_df.groupby(["a", "e"]).sum()
pd_res = pd_df.groupby(["a", "e"]).sum()
df_equals(md_res, pd_res)
# MultiIndex level [0] values are different (100.0 %)
# [left]:  Index(['c', 'g', 'b'], dtype='object', name='a')
# [right]: Index(['b', 'c', 'g'], dtype='object', name='a')
# At positional index 0, first diff: c != b
Issue Description

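Modin's result is unsorted: with the range-partitioning implementation enabled, the group keys come back in the order `c, g, b` instead of `b, c, g`. This seems to affect only multi-column groupby.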

Expected Behavior

The result should be sorted by the group keys, matching pandas, since `groupby` defaults to `sort=True`.
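A minimal sketch of the expected ordering, using plain pandas only so it runs without Modin. The `simulated` frame is a hand-reordered copy that mimics the unsorted key order from the report; it is an assumption for illustration, not actual Modin output. The final `sort_index()` call is only a possible stopgap for callers, not a fix for the underlying bug.

```python
import pandas

# Same data and filter as the reproducible example, built with plain pandas.
data = {
    "a": ["a", "b", "c", "d", "e", "b", "g", "a"] * 32,
    "b": [1, 2, 3, 4] * 64,
    "c": range(256),
    "d": range(256),
    "e": ["x", "y"] * 128,
}
pd_df = pandas.DataFrame(data)
mask = ~pd_df["a"].isin(["a", "e"]) & ~pd_df["b"].isin([4])

# groupby defaults to sort=True, so the group keys come out in sorted order.
expected = pd_df[mask].groupby(["a", "e"]).sum()
print(list(expected.index))
print(expected.index.is_monotonic_increasing)

# Simulated unsorted result (hypothetical): rows reordered by hand to
# mimic the c, g, b key order shown in the report.
simulated = expected.iloc[[1, 2, 0]]
print(simulated.index.is_monotonic_increasing)

# Possible stopgap: re-sort the index after the groupby.
restored = simulated.sort_index()
print(list(restored.index) == list(expected.index))
```

Checking `index.is_monotonic_increasing` on the groupby result is a cheap way to detect this kind of regression in a test without comparing against pandas directly.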
