Showing content from https://github.com/pandas-dev/pandas/issues/6697 below:

col.replace(dict) takes too much memory · Issue #6697 · pandas-dev/pandas · GitHub

I have code which ran fine about 10 months ago and now fails because it runs out of memory. The code basically does this:

groups = [[1, 2, 3], [222, 333, 444, 555], ...]  # from a 1.4MB json file
# about 30k such groups, results in ~70k replacements
replace_dict = {}
for replace_list in groups:
    # every id in a group is mapped to the smallest id of that group
    replacement = sorted(replace_list)[0]
    for replace_id in replace_list:
        if replace_id == replacement:
            continue
        replace_dict[replace_id] = replacement
# len(_col) == 974757
_col = df[column].astype("int64")
_col.replace(replace_dict, inplace=True)

I've now split replace_dict into 2k chunks and this works (it takes about 20 seconds for each chunk).
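
The chunking workaround could look roughly like the sketch below. The helper name `iter_dict_chunks` and the chunk size are hypothetical and not taken from my actual code. The sketch only shows how the dict is split. It assumes no replacement value is also a key in another chunk (true here if the groups are disjoint), since otherwise applying chunks one after another could chain replacements.

```python
from itertools import islice


def iter_dict_chunks(mapping, chunk_size):
    """Yield successive sub-dicts of `mapping` with at most `chunk_size` items each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    items = iter(mapping.items())
    while True:
        # islice consumes the next `chunk_size` (key, value) pairs from the iterator
        chunk = dict(islice(items, chunk_size))
        if not chunk:
            return
        yield chunk


# Small stand-in for the real replace_dict (hypothetical values)
replace_dict = {2: 1, 3: 1, 333: 222, 444: 222, 555: 222}
chunks = list(iter_dict_chunks(replace_dict, 2))
```

Each chunk would then be applied in turn, e.g. `for chunk in iter_dict_chunks(replace_dict, 2000): _col.replace(chunk, inplace=True)`.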

