`haven::read_sas` cannot read a SAS data file whose page size is 16 MiB (16,777,216 bytes). Some data files with page sizes slightly under 16 MiB also fail to read.

I would expect the attached sas7bdat files (zipped to keep the attachment under 10 MB) to be read by `haven::read_sas`, each yielding a 10,000-row tibble with a single column, `empty`, containing only empty strings.
20221123 - haven bug report.zip
```r
haven::read_sas('test_16766976.sas7bdat')
# Succeeds
# # A tibble: 10,000 x 1
#    empty
#    <chr>
#  1 ""
#  2 ""
#  3 ""
#  4 ""
#  5 ""
#  6 ""
#  7 ""
#  8 ""
#  9 ""
# 10 ""
# # ... with 9,990 more rows

haven::read_sas('test_16776192.sas7bdat')
# Fails
# Error in df_parse_sas_file(spec_data, spec_cat, encoding = encoding, catalog_encoding = catalog_encoding, :
#   Failed to parse <snip>/test_16776192.sas7bdat: Unable to allocate memory.

haven::read_sas('test_16777216.sas7bdat')
# Fails
# Error in df_parse_sas_file(spec_data, spec_cat, encoding = encoding, catalog_encoding = catalog_encoding, :
#   Failed to parse <snip>/test_16777216.sas7bdat: Unable to allocate memory.
```
I generated these files in SAS with:

```sas
* libname out "appropriate/path/here";

data out.test_16777216 (bufsize=16777216 compress=no);
  length empty $3000;
  do i = 1 to 10000;
    output;
  end;
  drop i;
run;

* PROC CONTENTS to verify page size;
proc contents data=out.test_16777216 varnum;
run;

data out.test_16776192 (bufsize=16776192 compress=no);
  length empty $3000;
  do i = 1 to 10000;
    output;
  end;
  drop i;
run;

proc contents data=out.test_16776192 varnum;
run;

data out.test_16766976 (bufsize=16766976 compress=no);
  length empty $3000;
  do i = 1 to 10000;
    output;
  end;
  drop i;
run;

proc contents data=out.test_16766976 varnum;
run;
```
Workaround: set the default page size in SAS to 8 MB with the system option `-BUFSIZE 8M`, or set `bufsize=` on a case-by-case basis when writing each dataset. The default page size in my operating environment is 16M.
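For the case-by-case variant, the same `bufsize=` dataset option used above to reproduce the bug can be used to force a smaller page size on a file that haven needs to read. A minimal sketch, with placeholder library and dataset names:

```sas
* Rewrite the dataset with an explicit 8 MiB page size so haven::read_sas can parse it;
* out.mydata and work.mydata are illustrative names, not from the report above;
data out.mydata (bufsize=8388608 compress=no);
  set work.mydata;
run;

* Verify the resulting page size;
proc contents data=out.mydata;
run;
```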