Write a DataFrame to the binary parquet format.
This function writes the dataframe as a parquet file. You can choose among different parquet backends, and compression is optional. See the user guide for more details.
Parameters
path : str, path object, file-like object, or None, default None
String, path object (implementing os.PathLike[str]), or file-like object implementing a binary write() function. If None, the result is returned as bytes. If a string or path, it will be used as the root directory path when writing a partitioned dataset.
engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
Parquet library to use. If 'auto', then the option io.parquet.engine is used. The default io.parquet.engine behavior is to try 'pyarrow', falling back to 'fastparquet' if 'pyarrow' is unavailable.
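For example, a minimal sketch of forcing a specific backend (this assumes pyarrow is installed; the file name is illustrative):

>>> import pandas as pd
>>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
>>> df.to_parquet('df.parquet', engine='pyarrow')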
compression : str or None, default 'snappy'
Name of the compression to use. Use None for no compression. Supported options: 'snappy', 'gzip', 'brotli', 'lz4', 'zstd'.
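As a quick illustration, writing without any compression (the file name is illustrative):

>>> df.to_parquet('df_uncompressed.parquet', compression=None)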
index : bool, default None
If True, include the dataframe's index(es) in the file output. If False, they will not be written to the file. If None, similar to True the dataframe's index(es) will be saved. However, instead of being saved as values, the RangeIndex will be stored as a range in the metadata so it doesn't require much space and is faster. Other indexes will be included as columns in the file output.
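For instance, a sketch of excluding the index from the written file (the file name is illustrative):

>>> df.to_parquet('df_noindex.parquet', index=False)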
partition_cols : list, optional, default None
Column names by which to partition the dataset. Columns are partitioned in the order they are given. Must be None if path is not a string.
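A minimal sketch of a partitioned write (note that this creates a directory tree of files rather than a single file; the names used here are illustrative):

>>> df_part = pd.DataFrame(data={'year': [2023, 2023, 2024], 'value': [1, 2, 3]})
>>> df_part.to_parquet('dataset_dir', partition_cols=['year'])

With the pyarrow engine this produces Hive-style subdirectories such as dataset_dir/year=2023/.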
storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with 's3://' and 'gcs://') the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details, and for more examples on storage options refer to the pandas user guide on IO tools.
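As an illustrative sketch only, writing directly to S3 (the bucket name and credential placeholders are hypothetical, and the s3fs package must be installed for fsspec to handle s3:// URLs):

>>> df.to_parquet(
...     's3://my-bucket/df.parquet',  # hypothetical bucket
...     storage_options={'key': '<ACCESS_KEY>', 'secret': '<SECRET_KEY>'},
... )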
**kwargs
Additional arguments passed to the parquet library. See pandas io for more details.
Notes
This function requires either the fastparquet or pyarrow library.
Examples
>>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
>>> df.to_parquet('df.parquet.gzip',
...               compression='gzip')
>>> pd.read_parquet('df.parquet.gzip')
   col1  col2
0     1     3
1     2     4
If you want to get a buffer to the parquet content you can use an io.BytesIO object, as long as you don't use partition_cols, which creates multiple files.
>>> import io
>>> f = io.BytesIO()
>>> df.to_parquet(f)
>>> f.seek(0)
0
>>> content = f.read()