Showing content from https://github.com/pandas-dev/pandas/issues/41574 below:

support defaultdict in read_csv dtype parameter · Issue #41574 · pandas-dev/pandas · GitHub

I have a large CSV file with 15k columns with non-obvious names, of which 14990 are floating-point numbers. I'd like to load them as floats without read_csv having to spend time inferring their types.

dtype accepts a dict, but building one with all the column names is tedious and not always possible. The obvious solution is to provide a defaultdict with a default of np.float32, plus entries for the other column types. Unfortunately, read_csv currently ignores the default silently. Presumably read_csv does not query the dictionary directly, but first checks whether the key is present.
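To illustrate that guess, here is a minimal sketch of how a defaultdict behaves under a membership check versus a direct lookup. It only shows standard Python behaviour and does not confirm what read_csv does internally; the column names `id` and `x1` are made up for the example.

```python
from collections import defaultdict

import numpy as np

# Hypothetical dtype mapping: float32 by default, one explicit override.
dtype = defaultdict(lambda: np.float32, {"id": np.int64})

# Membership tests and .get() never call the default factory,
# so code that checks "is this column in the dict?" sees nothing.
print("x1" in dtype)    # False
print(dtype.get("x1"))  # None

# Only a direct lookup triggers the default (and inserts the key).
print(dtype["x1"])      # <class 'numpy.float32'>
print("x1" in dtype)    # True
```

If read_csv does something like the first two calls, the default would never be consulted, which would match what I'm seeing.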

If this is not possible, it would be helpful to include a warning to the user, or at least some mention in the documentation, that defaultdict is not supported. It took me a long time to figure out why my floats weren't being treated as float32 and why read_csv was still trying to determine the types of these columns.
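For anyone hitting the same problem, one possible workaround is to read only the header first and build a plain dict from it. This is a sketch under assumptions: the inline CSV text, the column names, and the `overrides` mapping are all invented for illustration, and a real file path would replace the `io.StringIO` objects.

```python
import io

import numpy as np
import pandas as pd

# Small stand-in for the real file (hypothetical data).
csv_text = "id,label,x1,x2,x3\n1,a,0.5,1.5,2.5\n2,b,3.5,4.5,5.5\n"

# Hypothetical overrides for the few non-float columns.
overrides = {"id": np.int64, "label": str}

# nrows=0 parses only the header, so this step is cheap even for wide files.
columns = pd.read_csv(io.StringIO(csv_text), nrows=0).columns

# Plain dict: float32 for every column unless it has an override.
dtype = {name: overrides.get(name, np.float32) for name in columns}

df = pd.read_csv(io.StringIO(csv_text), dtype=dtype)
print(df.dtypes)
```

This costs an extra pass over the header line, but every column ends up with an explicit entry, so nothing relies on the default being honoured.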


