I am attempting to parse a sas7bdat file with a data set page size of 2097152 (~2 MiB). When attempting to parse this file I get the error:
Stopping with error: Error when attempting to parse sas7bdat: READSTAT_ERROR_PARSE
SAS file
As noted, the data set page size of the file is 2097152. The SAS dataset is a randomly created dataset with 2,000 rows and 110 created columns and is ~6MB in size.
I can regenerate the same file, reducing the page size from 2097152 (~2 MiB) down to 1048576 (~1 MiB), and the file parses without issue.
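For reference, the two page sizes above follow directly from the BUFSIZE values used to create the files. The snippet below is only an arithmetic sketch; `bufsize_mib_to_bytes` is a hypothetical helper named for illustration and is not part of SAS or ReadStat.

```c
#include <stdint.h>

/* Hypothetical helper (not from SAS or ReadStat): convert a BUFSIZE
 * expressed in MiB, such as bufsize=2M, into a byte count. */
static uint64_t bufsize_mib_to_bytes(uint64_t mib) {
    return mib << 20; /* 1 MiB = 2^20 = 1048576 bytes */
}
```

Under this conversion, `bufsize=2M` corresponds to 2097152 bytes and `bufsize=1M` to 1048576 bytes.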
The files I used for testing are in the following location:
To generate the tables linked to above, I manually adjusted the page size using the following in SAS.

/* Unable to parse */
data rand_ds_largepage_err(bufsize=2M);
  /* elided */
run;

/* Able to parse */
data rand_ds_largepage_ok(bufsize=1M);
  /* elided */
run;

SAS BUFSIZE option
According to the SAS documentation on the BUFSIZE option, the page size may be adjusted by altering the system or dataset BUFSIZE option. Again from the documentation, the maximum data set page size that may be set is 2147483647.
MAX → sets the page size to the maximum possible number in your operating environment, up to the largest 4-byte, signed integer, which is 2^31–1, or approximately 2 billion bytes.
Troubleshooting / Potential Fix
On line 257 of the readstat_sas.c file I note the hinfo->header_size is checked against 1<<20 (1048576 in decimal). If I alter this line to check against 1<<21 (2097152 in decimal), the file parses without issue.
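The effect of that change can be sketched in isolation. The function below is a hypothetical stand-in for the kind of upper-bound check described above, not the actual code in readstat_sas.c; the name `size_within_limit` and its parameters are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for an upper-bound sanity check on a size field.
 * Returns 1 if `size` would be accepted, 0 if it would be rejected.
 * A value equal to the limit is accepted, which matches the observation
 * that a 1048576-byte page parses under a 1<<20 limit. */
static int size_within_limit(uint64_t size, uint64_t limit) {
    return size <= limit;
}
```

With a limit of `1 << 20`, a size of 2097152 is rejected while 1048576 is accepted; with a limit of `1 << 21`, both are accepted.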
Because the SAS documentation notes that users can set the data set page size to as much as ~2 GiB, I wonder if the line should be adjusted to check against INT32_MAX. Obviously, making the adjustment may have ramifications that I don't immediately observe, as I'm not extremely familiar with the repository.
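As a rough sketch of what a check against INT32_MAX would accept, assuming the size is held in a 64-bit integer before validation: `page_size_plausible` is a hypothetical name, and the lower bound of 1 is an assumption for illustration, not something taken from ReadStat.

```c
#include <stdint.h>

/* Hypothetical check (not ReadStat's code): accept any positive page size
 * up to INT32_MAX (2^31 - 1 = 2147483647), the documented BUFSIZE maximum. */
static int page_size_plausible(int64_t page_size) {
    return page_size >= 1 && page_size <= INT32_MAX;
}
```

One possible ramification of a limit this high is that a parser which allocates a buffer of the declared size could be asked for nearly 2 GiB by a malformed file, so a smaller cap may still be preferable.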
Finally, I am glad to submit a PR with the change I noted above. Or if you have suggestions on a set of other changes I would need to trace through, I am glad to put in the work. All in all, I am glad to help in any way! Thanks so much for all the effort you (and others) have put into the library!