Showing content from https://github.com/mettekou/duckdb-read-stat below:

mettekou/duckdb-read-stat: Read data sets from SAS, Stata, and SPSS from DuckDB with ReadStat

The DuckDB ReadStat Extension

Use this extension to read data sets from SAS, Stata, and SPSS from DuckDB with ReadStat.

Installation is simple through the DuckDB Community Extensions repository; just type

INSTALL read_stat FROM community;
LOAD read_stat;

in a DuckDB instance near you.

The extension adds a single DuckDB table function, read_stat, which you use as follows:

-- Read a SAS `.sas7bdat` or `.xpt` file
FROM read_stat('sas_data.sas7bdat');
FROM read_stat('sas_data.xpt');
-- Read an SPSS `.sav`, `.zsav`, or `.por` file
FROM read_stat('spss_data.sav');
FROM read_stat('compressed_spss_data.zsav');
FROM read_stat('portable_spss_data.por');
-- Read a Stata `.dta` file
FROM read_stat('stata_data.dta');

If the file extension is not .sas7bdat, .xpt, .sav, .zsav, .por, or .dta, tell read_stat which file type to expect with the format parameter:

FROM read_stat('sas_data.other_extension', format = 'sas7bdat');
FROM read_stat('sas_data.other_extension', format = 'xpt');
-- SPSS `.sav` and `.zsav` can both be read through the format `'sav'`
FROM read_stat(
    'spss_data_possibly_compressed.other_extension',
    format = 'sav'
);
FROM read_stat('portable_spss_data.other_extension', format = 'por');
FROM read_stat('stata_data.other_extension', format = 'dta');
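
As a rough illustration of the mapping described above, the shell sketch below returns the format value that corresponds to a file name, treating .zsav like .sav as the comment above notes. The function name format_for_file is hypothetical and not part of the extension; this is not the extension's own detection code, and the match is case-sensitive.

```shell
#!/bin/sh
# Hypothetical helper (not part of the extension): print the `format` value
# matching a file name's extension, or fail if the extension is not one of
# those listed in this README.
format_for_file() {
    case "${1##*.}" in
        sas7bdat) echo sas7bdat ;;
        xpt)      echo xpt ;;
        sav|zsav) echo sav ;;   # both SPSS variants are read through 'sav'
        por)      echo por ;;
        dta)      echo dta ;;
        *)        return 1 ;;   # unknown extension: choose the format by hand
    esac
}
```

A non-zero exit status signals that the format has to be supplied explicitly, as in the queries above.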

To override the character encoding inferred from the file, pass an iconv encoding name (see https://www.gnu.org/software/libiconv/):

FROM read_stat('latin1_encoded.sas7bdat', encoding = 'iso-8859-1');
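
The sketch below builds a read_stat query string from a file name, a format, and an optional encoding. It assumes the format and encoding parameters can be passed in the same call, which this README does not show explicitly. The function name read_stat_query is hypothetical; it only prints SQL and does not escape quotes in file names.

```shell
#!/bin/sh
# Hypothetical helper (not part of the extension): print a read_stat query
# for FILE with an explicit FORMAT and an optional iconv ENCODING name.
read_stat_query() {
    file=$1
    format=$2
    encoding=${3:-}

    # Accept only the format names shown in this README.
    case "$format" in
        sas7bdat|xpt|sav|por|dta) ;;
        *) echo "unsupported format: $format" >&2; return 1 ;;
    esac

    if [ -n "$encoding" ]; then
        # Assumption: both named parameters may be combined in one call.
        printf "FROM read_stat('%s', format = '%s', encoding = '%s');\n" \
            "$file" "$format" "$encoding"
    else
        printf "FROM read_stat('%s', format = '%s');\n" "$file" "$format"
    fi
}
```

For example, read_stat_query latin1_data.bin sas7bdat iso-8859-1 prints a query that can be pasted into a DuckDB session in which the extension is loaded.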

If your files have the proper file extensions and you do not need to override their character encodings, a replacement scan is also available:

-- Read a SAS `.sas7bdat` or `.xpt` file
FROM 'sas_data.sas7bdat';
FROM 'sas_data.xpt';
-- Read an SPSS `.sav`, `.zsav`, or `.por` file
FROM 'spss_data.sav';
FROM 'compressed_spss_data.zsav';
FROM 'portable_spss_data.por';
-- Read a Stata `.dta` file
FROM 'stata_data.dta';

Clone the repository with its submodules:

git clone --recurse-submodules <repo>

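In principle, compiling this template only requires a C/C++ toolchain. However, this template relies on some additional tooling to make life a little easier and to be able to share CI/CD infrastructure with extension templates for other languages. The steps below use Git (to clone the repository and its submodules), Make (to drive the build), and Python 3 with venv support (for the configure step and the test runner).

How you install these dependencies varies per platform.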

After installing the dependencies, building is a two-step process. First, run:

make configure

This will ensure a Python venv is set up with DuckDB and DuckDB's test runner installed. Additionally, depending on configuration, DuckDB will be used to determine the correct platform for which you are compiling.

Then, to build the extension, run:

make debug

This delegates the build process to cargo, which will produce a shared library in target/debug/<shared_lib_name>. After this step, a script is run to transform the shared library into a loadable extension by appending a binary footer. The resulting extension is written to the build/debug directory.

To create optimized release binaries, simply run make release instead.

We recommend installing Ninja and Ccache, as they can significantly speed up builds during development. After installing them, use Ninja by running:

make clean
GEN=ninja make debug

This extension uses the DuckDB Python client for testing, which should be installed automatically by the make configure step. The tests themselves are written in the SQLLogicTest format, just like most of DuckDB's tests. A sample test can be found in test/sql/<extension_name>.test. To run the tests using the debug build:

make test_debug

or, for the release build:

make test_release

Testing with different DuckDB versions is really simple:

First, run

make clean_all

to ensure the previous make configure step is deleted.

Then, run

DUCKDB_TEST_VERSION=v1.1.2 make configure

to select a different DuckDB version to test with.

Finally, build and test with

make debug
make test_debug
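
The version-switching steps above can be repeated for several DuckDB versions with a small loop. This is a minimal sketch: the function name test_versions and the MAKE override are hypothetical, and the clean_all target is assumed from the upstream DuckDB extension template rather than confirmed for this repository.

```shell
#!/bin/sh
# Sketch: clean, configure, build, and test once per DuckDB version given.
# MAKE defaults to "make"; set MAKE="echo make" for a dry run that only
# prints the make invocations.
test_versions() {
    for version in "$@"; do
        # Assumed target name: removes the previous configure step.
        ${MAKE:-make} clean_all || return 1
        # Select the DuckDB version to test with.
        DUCKDB_TEST_VERSION="$version" ${MAKE:-make} configure || return 1
        ${MAKE:-make} debug || return 1
        ${MAKE:-make} test_debug || return 1
    done
}
```

Run from the repository root, test_versions v1.1.2 goes through the four steps for that version and stops at the first failing step.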

Using unstable Extension C API functionality

The DuckDB Extension C API has a stable part and an unstable part. By default, this template only allows usage of the stable part of the API. To switch it to allow using the unstable part, take the following steps:

First, set TARGET_DUCKDB_VERSION in ./Makefile to your desired DuckDB version. Then run make update_duckdb_headers to ensure the headers in ./duckdb_capi match that version. (FIXME: this is not yet working properly.)

Finally, set USE_UNSTABLE_C_API to 1 in ./Makefile. That's all!

