Read Stata file into DataFrame.
Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.dta.
If you want to pass in a path object, pandas accepts any os.PathLike.
By file-like object, we refer to objects with a read() method, such as a file handle (e.g. via builtin open function) or StringIO.
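The accepted input types described above can be sketched as follows. This is a minimal illustration, assuming a small hypothetical file named animals.dta written to a temporary directory; because Stata files are binary, the file handle is opened in "rb" mode.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data, written to a temporary directory.
dta_path = Path(tempfile.mkdtemp()) / "animals.dta"
pd.DataFrame({"speed": [350, 18, 361, 15]}).to_stata(dta_path, write_index=False)

# 1. A plain string path.
from_str = pd.read_stata(str(dta_path))

# 2. Any os.PathLike object, such as pathlib.Path.
from_path = pd.read_stata(dta_path)

# 3. A file-like object with a read() method (binary mode for .dta files).
with open(dta_path, "rb") as handle:
    from_handle = pd.read_stata(handle)
```

All three calls should yield the same DataFrame; which form to use is mostly a matter of where the data already lives.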
Convert date variables to DataFrame time values.
Read value labels and convert columns to Categorical/Factor variables.
Column to set as index.
Flag indicating whether to convert missing values to their Stata representations. If False, missing values are replaced with nan. If True, columns containing missing values are returned with object data types and missing values are represented by StataMissingValue objects.
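As a rough sketch of the missing-value flag described above (the `convert_missing` parameter of `read_stata`), the following writes a column containing one missing value and reads it back both ways. The file name and data are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical file with a single missing value in a float column.
dta_path = Path(tempfile.mkdtemp()) / "missing.dta"
pd.DataFrame({"x": [1.5, np.nan, 3.0]}).to_stata(dta_path, write_index=False)

# Default (convert_missing=False): missing values come back as nan.
as_nan = pd.read_stata(dta_path)

# convert_missing=True: the column is returned with object dtype and the
# missing entry is kept in its Stata representation.
as_objects = pd.read_stata(dta_path, convert_missing=True)

print(as_nan["x"].dtype)
print(as_objects["x"].dtype)
```

Keeping the Stata representation is mainly useful when the distinction between different Stata missing codes matters; for most numeric work the default is easier to handle.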
Preserve Stata datatypes. If False, numeric data are upcast to pandas default types for foreign data (float64 or int64).
Columns to retain. Columns will be returned in the given order. None returns all columns.
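The column-selection, index, and dtype options above correspond to the `columns`, `index_col`, and `preserve_dtypes` parameters. A small sketch, assuming a hypothetical two-column file whose index was written by `to_stata` under its default column name "index":

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data; to_stata writes the index as a column named "index".
dta_path = Path(tempfile.mkdtemp()) / "animals.dta"
pd.DataFrame({"speed": [350, 18, 361, 15], "legs": [2, 2, 2, 2]}).to_stata(dta_path)

# Keep only the listed columns, returned in the order given.
subset = pd.read_stata(dta_path, columns=["legs", "speed"])

# Use the stored "index" column as the DataFrame index.
indexed = pd.read_stata(dta_path, index_col="index")

# Upcast numeric data to pandas defaults instead of keeping Stata types.
upcast = pd.read_stata(dta_path, preserve_dtypes=False)

print(subset.columns.tolist())
print(upcast.dtypes)
```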
Flag indicating whether converted categorical data are ordered.
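The two categorical flags above (`convert_categoricals` and `order_categoricals`) can be compared side by side. This sketch assumes a hypothetical file with one value-labeled column, produced by writing a pandas Categorical with `to_stata`.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data: a Categorical column is written with Stata value labels.
dta_path = Path(tempfile.mkdtemp()) / "animals.dta"
df = pd.DataFrame({"animal": pd.Categorical(["falcon", "parrot", "falcon", "parrot"])})
df.to_stata(dta_path, write_index=False)

# Defaults: value labels become an ordered Categorical column.
labeled = pd.read_stata(dta_path)

# Same labels, but the resulting Categorical is unordered.
unordered = pd.read_stata(dta_path, order_categoricals=False)

# Skip label conversion and keep the underlying numeric codes.
codes = pd.read_stata(dta_path, convert_categoricals=False)

print(labeled["animal"].dtype)
print(codes["animal"].tolist())
```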
Return a StataReader object for iteration; each chunk contains the given number of lines.
Return a StataReader object.
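When `iterator=True` is passed, the returned StataReader can be read incrementally by hand. A minimal sketch, assuming a hypothetical 20-row file and using the reader's `read(nrows)` method inside a `with` block so the file is closed afterwards:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical 20-row file.
dta_path = Path(tempfile.mkdtemp()) / "rows.dta"
pd.DataFrame({"i": list(range(20))}).to_stata(dta_path, write_index=False)

# iterator=True returns a StataReader instead of a DataFrame.
with pd.read_stata(dta_path, iterator=True) as reader:
    first = reader.read(5)   # rows 0-4
    second = reader.read(5)  # rows 5-9, continuing where the first call stopped

print(len(first), len(second))
```

Compared with `chunksize`, this form leaves the caller in control of how many rows each call pulls.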
For on-the-fly decompression of on-disk data. If 'infer' and 'filepath_or_buffer' is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2' (otherwise no compression). If using 'zip' or 'tar', the ZIP file must contain only one data file to be read in. Set to None for no decompression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'xz', 'tar'} and other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdDecompressor, lzma.LZMAFile or tarfile.TarFile, respectively. As an example, the following could be passed for Zstandard decompression using a custom compression dictionary: compression={'method': 'zstd', 'dict_data': my_compression_dict}.
Added in version 1.5.0: Added support for .tar files.
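The compression handling described above can be tried with a gzip-compressed file. In this sketch the file name animals.dta.gz is hypothetical, and the file is produced with the `compression` argument of `DataFrame.to_stata`; it is then read once with inference from the extension and once with an explicit dict.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical gzip-compressed Stata file.
gz_path = Path(tempfile.mkdtemp()) / "animals.dta.gz"
df = pd.DataFrame({"speed": [350, 18, 361, 15]})
df.to_stata(gz_path, write_index=False, compression="gzip")

# Default compression="infer": gzip is detected from the ".gz" extension.
inferred = pd.read_stata(gz_path)

# Explicit dict form; extra keys would be forwarded to gzip.GzipFile.
explicit = pd.read_stata(gz_path, compression={"method": "gzip"})

print(inferred.equals(explicit))
```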
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with "s3://" and "gcs://") the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details; for more examples on storage options, refer to the pandas I/O user guide.
See also
io.stata.StataReader
Low-level reader for Stata data files.
DataFrame.to_stata
Export Stata data files.
Notes
Categorical variables read through an iterator may not have the same categories and dtype from chunk to chunk. This occurs when a variable stored in a DTA file is associated with an incomplete set of value labels that only label a strict subset of the values.
Examples
Create a dummy Stata file for this example:
>>> df = pd.DataFrame({'animal': ['falcon', 'parrot', 'falcon', 'parrot'],
...                    'speed': [350, 18, 361, 15]})
>>> df.to_stata('animals.dta')
Read a Stata dta file:
>>> df = pd.read_stata('animals.dta')
Read a Stata dta file in 10,000-line chunks:
>>> values = np.random.randint(0, 10, size=(20_000, 1), dtype="uint8")
>>> df = pd.DataFrame(values, columns=["i"])
>>> df.to_stata('filename.dta')
>>> with pd.read_stata('filename.dta', chunksize=10000) as itr:
...     for chunk in itr:
...         # Operate on a single chunk, e.g., chunk.mean()
...         pass