R package
Introduction
R packages give you a standardized way to organize your code (and data). In essence, an R package is a hyper-organized R project of functions. A first tip: whether or not you want to create an R package, it is always beneficial to organize analyses that belong to a single work package (e.g. a publication or presentation) into an R project, and to use functions for tasks that are repeated. Basic knowledge of the RStudio GUI makes creating an R package much easier. RStudio is not required to create an R package, but it simplifies many of the steps.
Before digging into the subject, here is a short overview of the RStudio GUI and a list of packages that will help you build your first R package.
RStudio panels
- editor
- console, terminal, R markdown, jobs
- files, plots, packages, help, viewer
- environment, history, connections, build, git, tutorial
To learn more about project management with RStudio and R Markdown, please have a look at the following valuable resources:
Packages
Install the following three packages that will make your life much easier.
install.packages('devtools')
install.packages('usethis')
install.packages('pkgdown')
Creating a basic R package
In this section, we’ll guide you through the process of creating a basic R package using the devtools package.
Create the package repository structure
To initiate your R package, start by creating the necessary directory structure.
In the RStudio menu, navigate to File -> New Project -> New Directory -> R package -> name
and replace “name” with the desired name for your package.
devtools::create("~/twiggy")
# usethis::create_package('~/twiggy')
Adding functions
Once your package structure is set up, you can add functions to it. Open an R script in the R/ directory and include your desired functions. Here, we’ve added two temperature conversion functions as examples:
F_to_C <- function(F_temp){
  C_temp <- (F_temp - 32) * 5/9
  return(C_temp)
}
C_to_F <- function(C_temp){
  F_temp <- (C_temp * 9/5) + 32
  return(F_temp)
}
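As a quick sanity check of the two conversion functions, the sketch below redefines them so that it runs on its own and tries them on the freezing and boiling points of water. The round-trip check with all.equal() is an addition for illustration, not part of the original example.

```r
# Redefined here so the example is self-contained
F_to_C <- function(F_temp) {
  (F_temp - 32) * 5 / 9
}
C_to_F <- function(C_temp) {
  (C_temp * 9 / 5) + 32
}

C_to_F(0)    # freezing point of water: 32
F_to_C(212)  # boiling point of water: 100

# Both functions are vectorised, so a whole column can be converted at once
C_to_F(c(0, 100))

# Converting back and forth should return the starting value
# (all.equal() allows for floating point rounding)
round_trip_ok <- isTRUE(all.equal(F_to_C(C_to_F(37)), 37))
round_trip_ok
```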
Load R package
To test your functions within the package, load it into your R environment using one of the following methods:
- Execute devtools::load_all() in the R console.
- Use the shortcut command/ctrl + shift + L.
- Navigate to Build -> Load All in the RStudio menu.
devtools::load_all()
Exercise
Create a minimum viable R package manually including:
1) Package Directory (twiggy)
2) R-directory (twiggy/R)
3) Functions (twiggy/R/functions.R)
4) Description-file (twiggy/DESCRIPTION)
5) Load package with load_all()
6) Try the functions and check documentation (e.g., ?F_to_C)
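One possible way to do steps 1 to 4 from the R console is sketched below. It builds the package in a temporary directory so nothing in your home folder is touched, and the DESCRIPTION field values (title, description, licence) are placeholders chosen for illustration. Step 5 is left as a comment because it requires devtools.

```r
# Steps 1 and 2: package directory and R/ sub-directory
pkg_dir <- file.path(tempdir(), "twiggy")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# Step 3: a file with the two conversion functions
writeLines(c(
  "F_to_C <- function(F_temp) {",
  "  (F_temp - 32) * 5 / 9",
  "}",
  "C_to_F <- function(C_temp) {",
  "  (C_temp * 9 / 5) + 32",
  "}"
), file.path(pkg_dir, "R", "functions.R"))

# Step 4: a minimal DESCRIPTION file (placeholder values)
writeLines(c(
  "Package: twiggy",
  "Title: Temperature Conversion Helpers",
  "Version: 0.0.0.9000",
  "Description: Converts temperatures between Fahrenheit and Celsius.",
  "License: GPL-3",
  "Encoding: UTF-8"
), file.path(pkg_dir, "DESCRIPTION"))

# Inspect the resulting structure
created_files <- list.files(pkg_dir, recursive = TRUE)
created_files

# Step 5 (not run here, needs devtools):
# devtools::load_all(pkg_dir)
```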
In the upcoming sections, we’ll delve into further details on documentation, testing, and package building. This initial setup lays the foundation for developing a well-organized and functional R package.
Document with Roxygen
When seeking assistance in R, the use of a question mark followed by a function (e.g., ?table) guides you to a help page containing documentation for that specific function. This documentation is a crucial component that you need to create when constructing your own package, and it can be seamlessly added using Roxygen tags. To illustrate, consider the following straightforward example:
#' @title C_to_F
#'
#' @description function to convert Celsius to Fahrenheit
#'
#' @param C_temp numeric value expressing the temperature in degrees Celsius
#'
#' @return numeric value expressing the temperature in degrees Fahrenheit
#'
#' @export
#'
C_to_F <- function(C_temp){
  F_temp <- (C_temp * 9/5) + 32
  return(F_temp)
}
The tags are identified by #' followed by the @ symbol and the tag identifier (e.g. #' @title). It is generally advisable to include essential tags such as @title, @description, @param, and @return, and then add supplementary tags as per the documentation needs (see Table 1 for an overview). After incorporating Roxygen tags, the document() function can be used to generate the requisite manual, embedding the documentation. See below for a step-by-step guide.
Table 1. Roxygen Tags Overview ordered by importance and relevance
tag | description |
---|---|
@title | Provides a short, one-line description of the purpose of the documented object. |
@description | Offers a more detailed description of the documented object. This tag is important for providing comprehensive information about the function or object. |
@param | Documents function parameters. Each parameter is typically documented using a separate @param tag, providing details such as the parameter name, type, and description. |
@return | Describes the return value of a function. Similar to @param, it includes details like the type of the return value and a description. |
@examples | Provides examples of how to use the function or code block. Examples are crucial for users to understand how to apply the code in practical situations. |
@author | Specifies the author(s) of the code. It’s a good practice to give credit to those who contributed to the code. |
@seealso | Links to other related functions or documentation. This tag helps users find additional relevant information. |
@details | Offers additional details or information that might be important for users to understand the inner workings or special considerations related to the documented object. |
@import | Indicates the packages that need to be imported for the code to run successfully. This is important for users to know which dependencies are required. |
@source | Specifies the source of the code. This can be useful for giving credit or referencing the origin of the code. |
@references | Provides references or citations related to the code. This is especially important for academic or research-oriented code. |
@export | Indicates that a particular function or object should be exported when the package is loaded. This is essential for package development. |
@keywords | Specifies keywords related to the code. This can aid in searching and categorizing the code. |
@inheritParams | Indicates that the documented function inherits parameters from another function. This can be useful when functions share similar parameter sets. |
Document your functions
Document these functions by adding roxygen tags (@param, @return, @examples, @export, @details, @description, @name) to each function.
To quickly add the roxygen skeleton, place the cursor within a function and then:
- Use the shortcut option/alt + shift + command/ctrl + R.
- Navigate to Code -> Insert Roxygen Skeleton in the RStudio menu.
#' @title C_to_F
#'
#' @param C_temp numeric value expressing the temperature in degrees Celsius
#'
#' @return numeric value expressing the temperature in degrees Fahrenheit
#'
#' @export
#'
#' @examples
#' # Convert Celsius to Fahrenheit
#' F_temp <- C_to_F(20)
#'
#' # Convert Celsius to Fahrenheit, but avoid running it for tests
#' \dontrun{
#' F_temp <- C_to_F(20)
#' }
#'
C_to_F <- function(C_temp){
  F_temp <- (C_temp * 9/5) + 32
  return(F_temp)
}
NOTE
The \dontrun{} tag is useful when an example should not be executed automatically, for instance because it has side effects such as downloading large datasets or running time-consuming computations. To use it, wrap the example code you want to exclude from automatic execution in \dontrun{}.
Document package
The function document() automatically generates the documentation files for your functions from the Roxygen tags.
- Execute devtools::document() in the R console.
- Use the shortcut command/ctrl + shift + D.
devtools::document()
Finally, use load_all() to reload your package locally.
devtools::load_all()
If you inspect the function C_to_F you will see that documentation is now available.
?C_to_F
Exercise
We have now documented the function C_to_F by adding the roxygen tags.
Use the same approach to document the F_to_C function.
Add Data
Adding data to your package enhances its functionality and ensures that users have access to essential datasets. The use_data() and use_data_raw() functions simplify the workflow of saving package data in the correct format. Datasets can also be documented with Roxygen tags. Follow these steps to include datasets in your package:
Create data-raw directory
The use_data_raw() function sets up a directory named data-raw and includes a template R script, DATASET.R. This directory can also be used to store your original datasets.
usethis::use_data_raw()
Edit DATASET.R
Edit DATASET.R to load and format the dataset, saving it as a suitable .rda file using the usethis::use_data() function.
# example of a created dataset
date <- as.Date('2023-09-01')
timestamp <- seq(date, date + 30, 1)
celcius <- sample(seq(10, 20, 0.1), length(timestamp))
temperatures <- data.frame(timestamp = timestamp, celcius = celcius)
# create a data directory and store the rda file with use_data()
usethis::use_data(temperatures, overwrite = TRUE)
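Because sample() draws random values, rerunning DATASET.R produces a different dataset each time. A minimal sketch, assuming you want the stored data to be reproducible, fixes the random seed first. The seed value 42 is an arbitrary choice and not part of the original example, and the use_data() call is left as a comment because it only works inside a package project.

```r
# Fix the random seed so the generated temperatures are reproducible
set.seed(42)

date <- as.Date("2023-09-01")
timestamp <- seq(date, date + 30, by = 1)               # 31 consecutive days
celcius <- sample(seq(10, 20, 0.1), length(timestamp))  # one value per day
temperatures <- data.frame(timestamp = timestamp, celcius = celcius)

# Check the structure before storing the dataset
str(temperatures)

# Inside the package project you would then store it with:
# usethis::use_data(temperatures, overwrite = TRUE)
```

Rerunning the script with the same seed gives the same data frame, which keeps the versioned .rda file stable between runs.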
Document your dataset
In the R/ directory, document your dataset using Roxygen tags. Include essential information such as @name, @description, @format, @source, and variable details. Good practice is to create an R script named data in which you document your datasets.
# example
#' @title Prices of 50,000 round cut diamonds
#'
#' @description A dataset containing the prices and other attributes of almost 54,000
#' diamonds.
#'
#' @format A data frame with 53940 rows and 10 variables:
#' \describe{
#'   \item{price}{price, in US dollars}
#'   \item{carat}{weight of the diamond, in carats}
#' }
#' @source \url{http://www.diamondse.info/}
"diamonds"
Exercise
Document above created `temperatures` data frame using the Roxygen-tags:
@name, @description, @format and @source.
Make sure to describe each column of the data frame.
Add Vignettes
It is a good idea to add extra detailed documentation of the usage of the R package as vignettes. Vignettes are in essence R markdown files with a specific header. To extend your R package with vignettes use the function below.
usethis::use_vignette("test")
Using your R package in a vignette
To use your R package in your vignette you need to have your package installed locally. The easiest way to install your package is by using the devtools function install(). Note that this function installs the package from the current state of your local package folder.
devtools::install()
Then, you can load your R package within the vignette with library().
library(twiggy)
Unit tests with testthat
In the R programming language, the testthat package is widely used for unit testing. It provides a simple and expressive framework for writing and running tests, making it an essential tool for ensuring the correctness of your code.
Install and Load testthat
If you haven’t installed the testthat package, you can do so using the following command:
install.packages("testthat")
After installation, load the package in your R script or RStudio session:
library(testthat)
Create a Test File
Create a separate directory within your package for tests, typically named “tests”. Inside this directory, create a new R script for your tests. The function use_testthat() facilitates this by creating a standard structure for your tests.
usethis::use_testthat()
Writing Your First Test
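To write your first test, use the function use_test(). Called with the name "tests", it creates a file test-tests.R in the directory tests/testthat and adds an example test. Without a name, use_test() takes the name from the file currently open in RStudio.
usethis::use_test("tests")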
You will see that the following simple test has been added to the file. In this example test_that() is used to define a test case and expect_equal() checks if the result of 2 * 2 is equal to 4.
test_that("multiplication works", {
expect_equal(2 * 2, 4)
})
Applied to a function in your package this would look as follows. Let’s assume you have a simple function called add_numbers in your package that adds two numbers. Here’s an example of how you might write a test for it.
test_that("add_numbers correctly adds two numbers", {
  result <- add_numbers(2, 3)
  expect_equal(result, 5)
})
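To try this test outside a package, the function under test has to exist first. The sketch below assumes a straightforward add_numbers() implementation (the original text does not show one, so this definition is hypothetical) and adds two extra expectations for illustration. The test only runs when testthat is installed.

```r
# Hypothetical implementation of the function under test
add_numbers <- function(x, y) {
  x + y
}

if (requireNamespace("testthat", quietly = TRUE)) {
  library(testthat)

  test_that("add_numbers correctly adds two numbers", {
    # the case from the text
    expect_equal(add_numbers(2, 3), 5)
    # negative numbers
    expect_equal(add_numbers(-1, 1), 0)
    # non-numeric input fails with an error from `+`
    expect_error(add_numbers("a", 1))
  })
}
```

In a real package the function would live in R/ and the test_that() block in tests/testthat/, so that devtools::test() picks it up.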
Run Your Tests
To run your tests, you can use the function test_file(), which will execute all the tests defined in the specified file.
testthat::test_file('tests/testthat/test-tests.R')
To run all tests of your R package you can use the test() function in devtools.
devtools::test()
Assertions and Matchers
testthat provides various assertions and matchers to test different conditions. Here are some examples:
Assertion/Matcher | Description |
---|---|
expect_equal(object, expected) | Test if object is equal to expected (within numerical tolerance). |
expect_identical(object, expected) | Test if object is identical to expected. |
expect_equivalent(object, expected) | Test if object is equivalent to expected (ignoring attributes). Deprecated in the third edition of testthat in favour of expect_equal(object, expected, ignore_attr = TRUE). |
expect_false(isTRUE(all.equal(object, unexpected))) | Test if object is not equal to unexpected. |
expect_false(identical(object, unexpected)) | Test if object is not identical to unexpected. |
expect_false(isTRUE(all.equal(object, unexpected, check.attributes = FALSE))) | Test if object is not equivalent to unexpected (ignoring attributes). |
expect_true(condition) | Test if condition is TRUE. |
expect_false(condition) | Test if condition is FALSE. |
expect_null(object) | Test if object is NULL. |
expect_false(is.null(object)) | Test if object is not NULL. |
expect_error(expr) | Test if evaluating expr produces an error. |
expect_warning(expr) | Test if evaluating expr produces a warning. |
expect_length(object, expected_length) | Test if the length of object is equal to expected_length. |
expect_is(object, class) | Test if object is an instance of the specified class. Deprecated in the third edition of testthat in favour of expect_s3_class() and expect_type(). |
expect_match(object, pattern) | Test if object matches the specified regular expression pattern. |
expect_output(expr, expected_output) | Test if evaluating expr produces the expected output. |
expect_gt(x, y) | Test if x is greater than y. |
expect_lt(x, y) | Test if x is less than y. |
expect_gte(x, y) | Test if x is greater than or equal to y. |
expect_lte(x, y) | Test if x is less than or equal to y. |
expect_setequal(set1, set2) | Test if the sets set1 and set2 are equal. |
expect_equal(list1, list2) | Test if the lists list1 and list2 are equal. |
expect_true(file.exists(path)) | Test if a file exists at the specified path. |
expect_true(file.size(path) > 0) | Test if a file at the specified path is not empty. |
Refer to the testthat documentation for a complete list of assertions and matchers: https://testthat.r-lib.org/
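A few of these expectations in action, as a minimal sketch. The values are made up for illustration, and the block only runs when testthat is installed. Note the difference between expect_equal(), which tolerates floating point rounding, and identical(), which does not.

```r
if (requireNamespace("testthat", quietly = TRUE)) {
  library(testthat)

  test_that("common expectations behave as described", {
    # equal within numerical tolerance, but not bit-for-bit identical
    expect_equal(0.1 + 0.2, 0.3)
    expect_false(identical(0.1 + 0.2, 0.3))

    # logical conditions and NULL
    expect_true(is.numeric(1.5))
    expect_null(NULL)

    # errors and warnings
    expect_error(stop("boom"), "boom")
    expect_warning(as.integer("x"))

    # length, regular expressions, comparisons and sets
    expect_length(1:3, 3)
    expect_match("twiggy package", "pack")
    expect_gt(2, 1)
    expect_setequal(c(1, 2, 3), c(3, 2, 1))
  })
}
```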
Exercise
1. create the above function add_numbers, including roxygen tags
2. add the above test of the function
3. run the test with devtools::test()
Build
Once your package is ready, you can build your package using devtools::build(), which creates a tarball (a compressed archive) of your package. By default the tarball is written to the parent directory of the package.
Before building, it’s a good idea to check your package for common issues using devtools::check(). This function performs a series of checks to ensure that your package meets certain standards and is ready for distribution. These checks help ensure the overall quality, consistency, and compliance of your R package with best practices and standards. It’s recommended to address any issues reported during the devtools::check() process before submitting your package to CRAN or sharing it with others.
Here are some of the key checks that devtools::check() performs:
check | description |
---|---|
R CMD check | This is a comprehensive set of checks performed by the R CMD check command. It includes various sub-checks for code syntax, documentation, and package structure. It ensures that your package adheres to the standards set by the R community. |
R CMD build | Checks are made to ensure that your package can be successfully built into a tarball (compressed archive). |
R CMD INSTALL | Installs your package locally to check for installation issues. |
Package dependencies | Checks that the dependencies declared in the DESCRIPTION file are available and can be installed. |
CRAN checks | Checks specific to CRAN (Comprehensive R Archive Network) submission requirements. These include checks for package licensing, proper use of URLs, and other guidelines specified by CRAN. |
Namespace checks | Verifies that your package namespace is properly defined and does not have any conflicts. |
Code style and best practices | Checks for adherence to coding style conventions, as recommended by the R community. |
Documentation checks | Verifies that the documentation is correctly formatted and that all exported functions are documented. |
Examples checks | Runs examples provided in the documentation to ensure they work as intended. |
Vignette checks | If your package includes vignettes, it checks that they can be built without errors. |
Unit tests | If you have unit tests in your package, they are executed to ensure that your functions work as expected. |
License checks | Verifies that your package has a valid license specified in the DESCRIPTION file. |
Encoding checks | Checks that your source code files use a consistent and valid character encoding. |
Check your package
check() runs a series of checks to test the validity of your package. Use one of the following approaches to run the checks:
- Execute devtools::check() in the R console.
- Use the shortcut shift + command/ctrl + E.
- Navigate to Build -> Check Package in the RStudio menu.
- Navigate to the Build panel (upper-right) and click Check.
devtools::check()
Address any issues that arise during the checking process.
Build your package
Once your package is ready, you can build your package. Use one of the following approaches:
- Execute devtools::build() in the R console.
- Execute R CMD build YourPackageName in a terminal from the directory in which your package is stored, where YourPackageName is the R package directory name.
- Use the shortcut command/ctrl + shift + B.
- Navigate to Build -> Build Source Package in the RStudio menu.
devtools::build()
This will create a tarball (a compressed archive) of your package. By default, devtools::build() writes it to the parent directory of the package; use the path argument to choose another location.
Install and load your package
You can install your package locally for testing using:
install.packages("path/to/YourPackageName_x.y.z.tar.gz", repos = NULL, type = "source")
Replace “path/to/YourPackageName_x.y.z.tar.gz” with the actual path to your package tarball.
Buildignore
To exclude certain files or directories from the built package, modify the .Rbuildignore file. For instance, you might want to ignore the data-raw directory, since you will not need it for the functioning of your R package.
usethis::use_build_ignore(c("yourfolder1", "yourfolder2", "yourfile"))
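Each line in .Rbuildignore is a Perl-style regular expression that is matched case-insensitively against file paths relative to the package root, which is why use_build_ignore() escapes and anchors the names you give it. The sketch below imitates that matching with base R; the patterns and file names are hypothetical examples, and the real matching is done by R CMD build.

```r
# Hypothetical .Rbuildignore entries, one regular expression per line
patterns <- c("^data-raw$", "^.*\\.Rproj$", "^\\.Rproj\\.user$")

# Hypothetical top-level paths in a package directory
paths <- c("data-raw", "R", "DESCRIPTION", "twiggy.Rproj")

# A path is ignored when at least one pattern matches it
is_ignored <- vapply(paths, function(path) {
  any(vapply(patterns, grepl, logical(1),
             x = path, perl = TRUE, ignore.case = TRUE))
}, logical(1))

is_ignored
```

Only data-raw and twiggy.Rproj match a pattern here, so R and DESCRIPTION would still end up in the built package.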
Exercise
1. Adjust the .Rbuildignore file and add the data-raw directory
2. Exclude any other directory that is not necessary
3. perform checks and correct issues
4. build package and search for the zipped file
5. install package
Git
Imagine you have a piece of code, and you’re keen on tracking its changes without losing the original version. The conventional method involves saving scripts as new files, often labeled with indicators like ‘v0’ or a timestamp. Git offers a more seamless way to version your code without the hassle of managing different version files manually. It not only tracks changes made to your files but also equips you with tools to document those changes. While Git’s initial development focused on code versioning, it’s versatile enough to handle smaller datasets. GitHub and GitLab support various text file formats (e.g., csv, fasta), making them ideal for versioning.
This section unfolds a step-by-step guide to (1) initiate a new project on GitHub or GitLab, (2) integrate it with your RStudio, and (3) employ basic git commands for effective version control. Although the example uses GitHub, the process is nearly identical for GitLab.
Create a new repository
Navigate to the GitHub or GitLab website and create a new project with a given name (YourPackageName). Include a README file during the initialization process and specify other optional settings (public/private, licensing, description).
Copy repository link
Copy the HTTPS link of your repository, https://github.com/<YourAccount>/<YourPackageName>.git, as shown in the figure below.
Set up repository in RStudio
Open RStudio and choose File -> New Project -> Version Control -> Git. Paste the Repository URL and specify the Project directory name (YourPackageName).
Modify gitignore file
Once you have created a git project, a .gitignore file is automatically created in the root directory of your project. In this file you can specify which files and directories should be ignored and thus should not be versioned or hosted on GitHub or GitLab.
Make changes, commit and push
Once you have cloned the git repository you have the tools to version the files in your R package. To test this, make changes to the README file and submit them to GitHub/GitLab from within RStudio. There are several ways to submit changes in RStudio:
- Use the shortcut command/ctrl + option/alt + M.
- Navigate to Tools -> Version Control -> Commit in the RStudio menu.
- Navigate to the Git tab in the upper-right panel in RStudio and click the Commit button.
A new window pops up in which you:
- Select the files you want to include in the commit.
- Add a message to describe the changes.
- Click Commit to submit the changes to Git.
- Click Push to push the committed changes to the remote repository.
Git can also be used from the terminal:
git add <filename>
git commit -m "commit message"
git push origin <branchname>
NOTE: It is recommended to push your changes at the end of the day, or whenever you stop working on a project. Before you resume work, always pull first to fetch changes made in GitHub or GitLab. That way the online and local versions remain synchronized.
Gitignore:
Exercise
Versioning
Versioning is a crucial aspect of software development that involves assigning unique identifiers to different releases of a software product. It provides a systematic way to manage and communicate changes to a codebase over time. The choice of version number depends on the changes made in your package and the conventions you want to follow. A common convention is to use semantic versioning (SemVer), which consists of three numbers separated by dots (e.g., 1.2.3).
Semantic versioning consists of the following components:
Version | Description |
---|---|
Major version (X.0.0) | Incremented for incompatible API changes. Users may need to modify their code when updating to a new major version. |
Minor version (X.Y.0) | Incremented for backward-compatible new features. Users should be able to update without breaking existing functionality. |
Patch version (X.Y.Z) | Incremented for backward-compatible bug fixes. Indicates that no new features have been added, only bug fixes. |
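The rules in the table can be sketched as a small helper. bump_version() is a hypothetical function written for this illustration and is not part of usethis; it assumes a plain three-part X.Y.Z version string. The last lines use base R's package_version() to show that versions are compared component by component as numbers.

```r
# Hypothetical helper: increment one component of an X.Y.Z version string
bump_version <- function(version, which = c("major", "minor", "patch")) {
  which <- match.arg(which)
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3, !anyNA(parts))

  idx <- match(which, c("major", "minor", "patch"))
  parts[idx] <- parts[idx] + 1L
  # components to the right of the incremented one are reset to zero
  if (idx < 3) {
    parts[(idx + 1):3] <- 0L
  }
  paste(parts, collapse = ".")
}

bump_version("1.2.3", "major")  # "2.0.0"
bump_version("1.2.3", "minor")  # "1.3.0"
bump_version("1.2.3", "patch")  # "1.2.4"

# Versions compare numerically per component: 1.10.0 is newer than 1.9.0
newer <- package_version("1.10.0") > package_version("1.9.0")
newer
```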
Versioning your R package
When your intention is to share your R package with the community it is important to use a versioning system. To facilitate versioning for your R package you can use the function usethis::use_version(). The version of an R package is managed in the DESCRIPTION file. All that use_version() does is modify the DESCRIPTION file and offer to commit the change.
usethis::use_version()
Current version is 0.0.0.9000.
Which part to increment? (0 to exit)
1: major --> 1.0.0
2: minor --> 0.1.0
3: patch --> 0.0.1
4: dev --> 0.0.0.9001
Selection:
Initialize Versioning
Before making your first official release, initialize versioning. By default, when creating your R package directory structure the DESCRIPTION file will define the first development version as 0.0.0.9000, indicating that it’s in the early stages and under active development.
If the version in your DESCRIPTION file is not a development version yet, you can switch to one with the following code, which appends the .9000 development component to the current version.
usethis::use_dev_version()
First Official Release
When you’re prepared to publish your initial release for users, transition from the development version (e.g., 0.0.0.9000) to version 1.0.0. This change signifies that your package is stable and ready for public use. Various strategies can be employed for the first release; however, it is generally recommended to designate the inaugural official version as 1.0.0. use_version() takes the component to increment ("major", "minor", "patch" or "dev"), so moving from 0.0.0.9000 to 1.0.0 is a major increment.
usethis::use_version("major")
Development and Changes
As you develop your package, you’ll make further changes. Follow the above guidelines for version updates distinguishing between:
Major Version (X.0.0)
Minor Version (X.Y.0)
Patch Version (X.Y.Z)
Create a news item
Whenever we update the package version we should record what changes have been made. We do this in a NEWS.md file.
usethis::use_news_md()
✔ Writing 'NEWS.md'
● Modify 'NEWS.md'
Modify the file to record what we have done during the workshop.
# mypkg 1.0.0
* Created the package
* Added the `make_shades()` function
* Added the `plot_colours()` function
* Added a vignette
Building a website
In this section, you’ll learn how to create a YAML configuration file (_pkgdown.yml) for your R package using usethis::use_pkgdown(). We’ll also cover the importance of YAML files and the steps to build and view your package website.
Create YAML configuration
Run the following command to create the _pkgdown.yml file:
usethis::use_pkgdown()
This command creates a basic YAML configuration file in your package directory.
Edit the _pkgdown.yml file
Open the generated _pkgdown.yml file and customize it according to your preferences. This file contains settings for your package website, including the destination, URL, and navigation bar.
Example _pkgdown.yml:
destination: public
url: https://<USERNAME>.gitlab.io/<REPOSITORYNAME>/
development:
  mode: auto
navbar:
  right:
    - icon: fa-gitlab
      href: https://gitlab.com/<USERNAME>/<REPOSITORYNAME>
Adjust the destination and url fields with your GitLab or GitHub username and repository name.
Build your package website
Ensure your package has a DESCRIPTION file and a README file. If not, run:
usethis::use_description()
usethis::use_readme_rmd()
Then, generate your package website using:
pkgdown::build_site()
This command creates a directory containing the HTML files for your website: docs/ by default, or the directory set in the destination field of _pkgdown.yml.
Commit changes to Git
Commit the pkgdown configuration to your Git repository so that the website can be published:
usethis::use_git(message = "Add pkgdown configuration and setup")
Alternatively, commit using the approach described in the Git section.
Activate pages
GitHub pages activation
Go to your GitHub repository.
Navigate to the "Settings" tab.
Scroll down to the "GitHub Pages" section.
Set the "Source" branch to main or master (whichever branch you are using).
Confirm the activation.
GitLab pages activation
The activation process for GitLab Pages can vary based on your GitLab instance and permissions. Typically, it involves:
Go to your GitLab repository.
Navigate to the "Settings" or "CI / CD" section.
Look for "Pages" settings.
Configure the Pages settings, including the source branch.
Please refer to the specific documentation of your GitLab instance for detailed instructions.
View your package website
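After building the site and activating pages, your package website will be available at the following link:
https://<USERNAME>.github.io/<REPOSITORYNAME>/
Replace <USERNAME> and <REPOSITORYNAME> with your GitHub username and repository name.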
Large file support (LFS)
Large file support comes in handy if you would like to include large files in your package without bloating your git repository. Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server like GitHub.com. To enable LFS, navigate to https://git-lfs.com/ to download and install the extension.
Set up LFS
Once downloaded and installed, set up Git LFS for your user account by running:
git lfs install
You only need to run this once per user account.
Select file types
In each Git repository where you want to use Git LFS, select the file types you’d like Git LFS to manage (or directly edit your .gitattributes). You can configure additional file extensions at anytime.
git lfs track "*.html"
Check .gitattributes
Now make sure .gitattributes is tracked:
git add .gitattributes
Shortcuts
MacOS / Windows | R command | RStudio Menu |
---|---|---|
command/ctrl + shift + E | devtools::check() | Build -> Check Package |
command/ctrl + shift + B | devtools::build() | Build -> Build Source Package |
command/ctrl + shift + L | devtools::load_all() | Build -> Load All |
command/ctrl + shift + D | devtools::document() | |
command/ctrl + option/alt + M | | Tools -> Version Control -> Commit |
command/ctrl + shift + C | | Code -> Comment/Uncomment Lines |
option/alt + shift + command/ctrl + R | | Code -> Insert Roxygen Skeleton |
EXTRA
todor
install.packages('todor')