Minor issue regarding read_csv's na_values argument in dict format. I note that the list format works fine when the NA value is given as a float type (which is often the intuitive choice), e.g.:
co2 = pd.read_csv("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt", comment = "#", delim_whitespace = True, names = ["year", "month", "decimal_date", "average", "interpolated", "trend", "days"], na_values =[-99.99, -1])
However, the dict format is more appropriate for this classic data set, since different columns use different NA values. Unfortunately, this fails with an error about float type:
co2 = pd.read_csv("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt", comment = "#", delim_whitespace = True, names = ["year", "month", "decimal_date", "average", "interpolated", "trend", "days"], na_values = {"decimal_date" : -99.99, "days" : -1})
Instead, the NA values must be given as strings, which feels all kinds of wrong here:
co2 = pd.read_csv("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt", comment = "#", delim_whitespace = True, names = ["year", "month", "decimal_date", "average", "interpolated", "trend", "days"], na_values = {"decimal_date" : "-99.99", "days" : "-1"})
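For anyone who wants to try the string workaround without the FTP download, here is a minimal sketch. The rows in `sample` are made up for illustration (not real NOAA values), and putting the -99.99 sentinel in the "average" column is an assumption of this sample only. It also uses sep=r"\s+" instead of delim_whitespace, which newer pandas releases deprecate.

```python
import io

import pandas as pd

# Made-up rows in the same whitespace-delimited layout; NOT real NOAA data.
sample = """# header comment, skipped via comment="#"
1958 3 1958.208 315.71 315.71 314.62 -1
1958 4 1958.292 -99.99 317.45 315.29 -1
1958 5 1958.375 317.50 317.50 314.71 25
"""

columns = ["year", "month", "decimal_date", "average",
           "interpolated", "trend", "days"]

co2 = pd.read_csv(
    io.StringIO(sample),
    comment="#",
    sep=r"\s+",  # whitespace-delimited, equivalent to delim_whitespace=True
    names=columns,
    # Per-column NA markers, given as strings (the workaround from this report).
    na_values={"average": "-99.99", "days": "-1"},
)

# Count of missing values per column after parsing.
print(co2.isna().sum())
```

Only the columns named in the dict are checked against their own sentinel, so a -1 appearing in any other column would be left alone.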
Thanks for all the pandas awesomeness,