---
output: github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
out.width = "100%"
)
library(readsas)
```
# readsas
<!-- badges: start -->

[](https://app.codecov.io/gh/JanMarvin/readsas?branch=main) [](https://janmarvin.r-universe.dev/readsas)
<!-- badges: end -->
An R package that uses Rcpp to parse a SAS file into a `data.frame`. Currently,
`read.sas` is the main function and feature of this package.
The package supports (experimental) reading of sas7bdat files that are
* (un)compressed
As with other releases of the `read` series, the focus is on being as
accurate as possible. Speed is welcome, but it is a secondary goal.
## Installation
With `remotes`:
``` r
remotes::install_github("JanMarvin/readsas")
```
With `r-universe`:
``` r
options(repos = c(
  janmarvin = 'https://janmarvin.r-universe.dev',
  CRAN = 'https://cloud.r-project.org'))
install.packages('readsas')
```
## Basic usage
```{r}
fl <- system.file("extdata", "cars.sas7bdat", package = "readsas")
dd <- read.sas(fl)
head(dd)
```
## Select columns or rows
Selecting columns or rows should be much faster than reading the whole file, because unselected cells are skipped during reading. Loading only specific columns or rows is also memory efficient. However, the file header is always read in its entirety, so a file with a large header will still take some time to read.
```{r}
fl <- system.file("extdata", "mtcars.sas7bdat", package = "readsas")
dd <- read.sas(fl, select.cols = c("VAR1", "mpg", "hp"),
               select.rows = 2:5, rownames = TRUE)
head(dd)
```
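To get a rough feel for the effect of a selection, a full read can be compared with a selective read of the same file. The following is a minimal sketch, not a benchmark: it assumes the bundled `mtcars.sas7bdat` file and uses only the `select.cols` and `select.rows` arguments shown above, plus base R's `system.time()`. The object names (`full`, `part`, `t_full`, `t_part`) are illustrative. On a file this small the timings will likely be dominated by noise, so any real difference should only be expected on larger files.

```r
library(readsas)

# Bundled example file, the same one used in the chunk above.
fl <- system.file("extdata", "mtcars.sas7bdat", package = "readsas")

# Read the whole file, then only two columns and rows 2 to 5.
t_full <- system.time(full <- read.sas(fl))
t_part <- system.time(
  part <- read.sas(fl, select.cols = c("mpg", "hp"), select.rows = 2:5)
)

# Compare the dimensions of both results.
dim(full)
dim(part)

# Compare elapsed wall-clock time in seconds (illustrative only).
c(full = t_full[["elapsed"]], part = t_part[["elapsed"]])
```

For a meaningful comparison, the same pattern would need to be run on a larger sas7bdat file and repeated several times.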
## Thanks
The documentation of the sas7bdat format by Matt Shotwell and Clint Cummins in
their R package [`sas7bdat`](https://github.com/BioStatMatt/sas7bdat), by
Jared Hobbs in the Python library
[`sas7bdat`](https://bitbucket.org/jaredhobbs/sas7bdat/src/master/), and by EPAM in
the Java library [`parso`](https://github.com/epam/parso) was crucial.
Without their reverse engineering of the SAS format, this package would not have been
possible.
Further testing was done using the R package
[`haven`](https://github.com/tidyverse/haven) by Hadley Wickham and Evan Miller.