Showing content from https://github.com/pandas-dev/pandas/issues/45350 below:

No error raised when writing dta files using `to_stata()` if the column contains `-np.inf` · Issue #45350 · pandas-dev/pandas · GitHub

Reproducible Example
import pandas as pd
import numpy as np
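# Raises ValueError: the +inf value is detected.
pd.DataFrame({'x': [1, np.inf]}).to_stata('test.dta')
# No error: the -inf value is written to the file unchecked.
pd.DataFrame({'x': [1, -np.inf]}).to_stata('test.dta')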
Issue Description

If any column contains np.inf, an error is raised when trying to write the DataFrame to a dta file:

ValueError: Column x has a maximum value of infinity which is outside the range supported by Stata.

However, if the column contains -np.inf instead, no error is raised, and when the resulting dta file (test.dta) is opened in Stata, -np.inf is displayed as -1.#INF:

     +-----------------+
     | index         x |
     |-----------------|
  1. |     0         1 |
  2. |     1   -1.#INF |
     +-----------------+
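Until the check covers both signs, a caller can guard against this before writing. The following is only a minimal sketch of such a guard, not part of pandas: the helper name `columns_with_inf` is hypothetical, and it assumes column names are unique.

```python
import numpy as np
import pandas as pd


def columns_with_inf(df: pd.DataFrame) -> list:
    """Return the numeric columns that contain +inf or -inf.

    Hypothetical helper for illustration; not a pandas API.
    """
    bad = []
    for col in df.select_dtypes(include="number").columns:
        # Convert to float64 so integer and nullable columns are handled
        # uniformly; missing values become NaN, which is not infinite.
        values = df[col].to_numpy(dtype="float64", na_value=np.nan)
        if np.isinf(values).any():
            bad.append(col)
    return bad


df = pd.DataFrame({"x": [1.0, -np.inf], "y": [1.0, 2.0]})
offending = columns_with_inf(df)
if offending:
    print(f"Refusing to write: infinite values in {offending}")
else:
    df.to_stata("test.dta")
```

Because `np.isinf` is true for both signs, this catches the -np.inf case that currently slips through.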
Expected Behavior

I expect the same error to be raised if any column contains either np.inf or -np.inf.

It seems to be related to the following lines:

value = data[col].max()
if np.isinf(value):
    raise ValueError(
        f"Column {col} has a maximum value of infinity which is outside "
        "the range supported by Stata."
    )

When checking for infinity, line 607 picks up the maximum value of the column, which will be np.inf if it exists. However, it will not pick up -np.inf, which is the minimum value of the column. Hence, an easy fix would be to replace the quoted lines with

maxvalue = data[col].max() 
minvalue = data[col].min()
if np.isinf(maxvalue) or np.isinf(minvalue): 
    raise ValueError( 
        f"Column {col} has a maximum value of infinity (or a minimum value of -infinity) which is outside " 
        "the range supported by Stata." 
    ) 
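To see why the maximum alone misses -np.inf, the proposed check can be tried outside pandas. This is a standalone sketch under assumptions, not the actual pandas source: the wrapper function `check_column_finite` is hypothetical and exists only so the logic can be run on its own.

```python
import numpy as np
import pandas as pd


def check_column_finite(data: pd.DataFrame, col: str) -> None:
    """Raise if the column's max or min is infinite (hypothetical wrapper)."""
    maxvalue = data[col].max()
    minvalue = data[col].min()
    if np.isinf(maxvalue) or np.isinf(minvalue):
        raise ValueError(
            f"Column {col} has a maximum value of infinity (or a minimum "
            "value of -infinity) which is outside the range supported by Stata."
        )


neg = pd.DataFrame({"x": [1.0, -np.inf]})

# The maximum of [1, -inf] is the finite value 1, so a max-only test passes.
print(np.isinf(neg["x"].max()))  # False
# The minimum is -inf, which the extra test catches.
print(np.isinf(neg["x"].min()))  # True

try:
    check_column_finite(neg, "x")
except ValueError as exc:
    print(exc)
```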
Installed Versions

commit : 7c48ff4
python : 3.8.12.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19043
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.utf8

pandas : 1.2.5
numpy : 1.20.2
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.3
setuptools : 52.0.0.post20210125
Cython : None
pytest : 6.2.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : 2.9.1 (dt dec pq3 ext lo64)
jinja2 : 3.0.1
IPython : 7.22.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : 2021.06.0
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : None
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.2
sqlalchemy : 1.4.22
tables : None
tabulate : 0.8.9
xarray : None
xlrd : 2.0.1
xlwt : None
numba : 0.53.1

