Showing content from https://github.com/pandas-dev/pandas/issues/14983 below:

read_csv fails with uint64 · Issue #14983 · pandas-dev/pandas · GitHub

master at aba7d2:

>>> from pandas import read_csv
>>> from pandas.compat import StringIO
>>> data = 'a\n' + str(2**63)
>>>
>>> read_csv(StringIO(data), engine='c').info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 1 columns):
a    1 non-null object
dtypes: object(1)
memory usage: 88.0+ bytes
>>>
>>> read_csv(StringIO(data), engine='python').info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 1 columns):
a    1 non-null object
dtypes: object(1)
memory usage: 88.0+ bytes

We should be able to handle uint64, and tests like this one here should not be enforcing buggy behavior.

The buggy behavior for the C engine traces to here, where we attempt to cast according to this order defined here. Note for starters that uint64 is not in that list. The try-except there catches the OverflowError raised for int64, after which we immediately fall back to an object array of strings. At first, I thought inserting uint64 into the list would be enough, but that can cause bad casting in the other direction, i.e. negative numbers get converted to their uint64 equivalents.
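To illustrate the bad casting in the other direction, here is a minimal NumPy sketch. It is not the parser code itself; it only assumes that a column already held as int64 gets cast to uint64, which is roughly what naively adding uint64 to the cast order would allow. The variable names are made up for the example.

```python
import numpy as np

# Negative values stored as int64, then cast to uint64.
# NumPy's default casting for astype is 'unsafe', so no error is raised:
# each value silently wraps around modulo 2**64.
negatives = np.array([-1, -2], dtype=np.int64)
wrapped = negatives.astype(np.uint64)

print(wrapped.tolist())  # [18446744073709551615, 18446744073709551614]
```

So any uint64 path would need to check for a minus sign before casting, rather than relying on an exception.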

The buggy behavior for the Python engine traces to here, where we attempt to infer the dtype. However, as I pointed out in #14982, that inference function fails with uint64 because of a similar (and nonsensical) try-except for OverflowError with int64.

The questions that I posed in #14982 are also relevant here, since the behavior should be consistent across both engines and also be performant. Patching the Python engine probably requires fixing #14982 first, and patching the C engine probably requires adding new functions to parser.pyx and tokenizer.c to parse uint64. However, in light of the questions that I posed in #14982, I'm not really sure what is best.
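For discussion, one possible rule that both engines could share is sketched below in pure Python. This is only an assumption about what "consistent" might mean, not what pandas does: `infer_integer_column` is a hypothetical helper, it expects a non-empty list of integer-like strings, and it ignores performance entirely.

```python
import numpy as np

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
UINT64_MAX = 2**64 - 1


def infer_integer_column(tokens):
    """Hypothetical helper: choose int64, uint64, or object for string tokens."""
    # Python ints have arbitrary precision, so parsing never overflows here.
    values = [int(tok) for tok in tokens]
    lo, hi = min(values), max(values)

    # Prefer int64 whenever every value fits.
    if INT64_MIN <= lo and hi <= INT64_MAX:
        return np.array(values, dtype=np.int64)

    # Use uint64 only if there are no negatives and everything fits.
    if lo >= 0 and hi <= UINT64_MAX:
        return np.array(values, dtype=np.uint64)

    # Negatives mixed with values above the int64 maximum, or values beyond
    # the uint64 maximum: keep the original strings in an object array.
    return np.array(tokens, dtype=object)
```

With this rule, `infer_integer_column([str(2**63)])` would come back as uint64, while a column mixing `-1` with `2**63` would stay as object instead of wrapping.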

