Showing content from https://github.com/WizardMac/ReadStat/issues/249 below:

Unable to parse sas7bdat when data set page size exceeds 1 MiB

Issue

I am attempting to parse a sas7bdat file with a data set page size of 2097152 bytes (2 MiB). Parsing fails with the following error:

Stopping with error: Error when attempting to parse sas7bdat: READSTAT_ERROR_PARSE

SAS file

As noted, the data set page size of the file is 2097152 bytes. The SAS data set is randomly generated, with 2,000 rows and 110 columns, and is ~6 MB in size.

If I regenerate the same file with the page size reduced from 2097152 (2 MiB) to 1048576 (1 MiB), the file parses without issue.

The files I used for testing are in the following location:

To generate the tables linked above, I manually set the page size with the BUFSIZE data set option in SAS:

/* Unable to parse */
data rand_ds_largepage_err(bufsize=2M);
  /* elided */
run;

/* Able to parse */
data rand_ds_largepage_ok(bufsize=1M);
  /* elided */
run;

SAS BUFSIZE option

According to the SAS documentation on the BUFSIZE option, the page size may be adjusted by altering the system or data set BUFSIZE. Again from the documentation, the maximum data set page size that may be set is 2147483647 bytes.

MAX → sets the page size to the maximum possible number in your operating environment, up to the largest 4-byte, signed integer, which is 2^31–1, or approximately 2 billion bytes.

Troubleshooting / Potential Fix

On line 257 of the readstat_sas.c file I note the hinfo->header_size is checked against 1<<20 (1048576 in decimal). If I alter this line to check against 1<<21 (2097152 in decimal), the file parses without issue.
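To make the effect of that limit concrete, here is a minimal sketch of an upper-bound comparison of the kind described above. The helper name and signature are hypothetical and written only for illustration; this is not ReadStat's actual code, which may check additional fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of ReadStat: returns 1 when `size` is
 * positive and does not exceed `limit`, and 0 otherwise. */
static int example_size_within_limit(int64_t size, int64_t limit) {
    assert(limit > 0); /* a non-positive limit would reject everything */
    return size > 0 && size <= limit;
}
```

Under these assumptions, a size of 2097152 is rejected against a limit of 1<<20 and accepted against a limit of 1<<21, while 1048576 passes in both cases. That matches the behavior I see with the two test files.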

Because the SAS documentation notes that users can set the data set page size to as much as 2^31-1 bytes (~ 2 GiB), I wonder if the line should be adjusted to check against INT32_MAX. Obviously, making the adjustment may have ramifications that I don't immediately see, as I'm not very familiar with the repository.
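As a rough sketch of what a check against INT32_MAX could look like, assuming the limit is applied to a 64-bit page size value: the macro and function names below are hypothetical and not taken from ReadStat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit mirroring the documented BUFSIZE maximum of 2^31 - 1 bytes. */
#define EXAMPLE_MAX_PAGE_SIZE ((int64_t)INT32_MAX)

/* Hypothetical helper: 1 if page_size is positive and within the limit, else 0. */
static int example_page_size_allowed(int64_t page_size) {
    return page_size > 0 && page_size <= EXAMPLE_MAX_PAGE_SIZE;
}

/* Exercises the helper with the page sizes mentioned in this issue. */
static void example_self_check(void) {
    assert(example_page_size_allowed(1048576));            /* 1 MiB */
    assert(example_page_size_allowed(2097152));            /* 2 MiB */
    assert(example_page_size_allowed(2147483647));         /* documented maximum */
    assert(!example_page_size_allowed(INT64_C(2147483648))); /* one past the maximum */
}
```

The value is widened to int64_t before comparing so that a size one past the maximum can still be represented and rejected. Whether a limit this large is safe elsewhere in the parser is something I have not verified.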

Finally, I am glad to submit a PR with the change I noted above. Or if you have suggestions on a set of other changes I would need to trace through, I am glad to put in the work. All in all, I am glad to help in any way! Thanks so much for all the effort you (and others) have put into the library!

