Building R Packages

SciencesPo Intro To Programming 2024

Florian Oswald

29 April, 2024

R and Packages

Questions

  1. Why write our own R package?
  2. How to create an R package?
  3. What are unit tests?

Objectives

  • Learn the RStudio-powered package development workflow.
  • Create a package, and test it.
  • Publish the package to github.
  • Publish the package docs as a self-contained website.

R and Packages

  • We have been using R packages all the time.

  • Each time we say library(xyz) we are using external code provided in the xyz package.

  • You can write your own packages.

What’s the point of Packages?

  1. Extend R functionality.
  2. for researchers: key tool to ensure reproducibilty of findings
  3. for researchers: key tool to organize code in team work
  • Let’s go through some material from the r-pkgs book!

Building a Toy Package

RStudio for the win

  • We do this in RStudio

  • we use the devtools package

  • check you have a recent version:

packageVersion("devtools")
[1] '2.4.5'
  • if not - reinstall.

Let’s Do it!

library(devtools)

# create a package `here`
create_package("~/toypackage")
Package: toypackage
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R (parsed):
    * First Last <first.last@example.com> [aut, cre] (YOUR-ORCID-ID)
Description: What the package does (one paragraph).
License: `use_mit_license()`, `use_gpl3_license()` or friends to pick a
    license
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
  • You see Rstudio jumps to that location

Adding Git

  • Of course we want to track our package with git.
  • We use functions from the usethis package. This is loaded by default when attaching the devtools package (use_git is part of usethis…)
library(devtools)
use_git()
  • Say Yes to everything ✌️

Adding Code

  • We add R source code in the R/ folder.
  • Create as many .R files as you want.
  • It’s good practice to organize tests accompanying source files.
use_r("sayhello")
• Edit 'R/sayhello.R'
• Call `use_test()` to create a matching test file
  • What’s with that use_test() thing? 🤔 Let’s worry about this later.

Ok, but…Adding Code??

  • Let’s add a function to the file R/sayhello.R:
# Notice I'm using = instead of < - because 
# the font of those slides prints it weirdly
hello = function(who){
    paste("hello,",who)
}
  • Now, if this were a simple R script, we could source the R/sayhello.R file into global space and try this out.
  • We don’t want to do that here though. 🤨
  • Instead, we want to load the package, which contains our function.
  • do load_all():
load_all()
ℹ Loading toypackage

Trying out Code

load_all()

  • The load_all() function simulates the process of building, installing, and attaching the toypackage package.
  • This means that all the functions you included in the package will become visible in the global scope (in your console)
  • This is not in general the case: Later on we will fine-tune which functions are visible to the user, and which ones are not!
  • Call the function with your name!
hello("Peter")
[1] "hello, Peter"
  • Great!

Checking the Package

  • R has a rigid set of rules for what a package needs to look like.
  • What files should be where, their names and permissions, such that the structure is nicely uniform across all R packages.
  • Particularly relevant for official packages on CRAN
  • Do this here often:
check()
  • This outputs a bunch of things:
    1. It actually builds our package in a separate process - immune from our current workspace
    2. It runs a battery of checks and returns a report:
0 errors ✔ | 1 warning ✖ | 1 note ✖

Editing DESCRIPTION

  • Open the DESCRIPTION file (or type Ctrl + . and start typing desc)
  • Fill in the obviously missing contents.

Adding a LICENSE

Use a license, any license (Jeff Atwood)

Let’s

use_mit_license()

Documenting with Roxygen

  • Go back to the hello function, place the cursor inside the function body, and do Code > Insert Roxygen Skeleton.
  • You’ll see something like this:
#' Title
#'
#' @param who 
#'
#' @return
#' @export
#'
#' @examples
hello <- function(who){
  paste("hello,",who)
}
  • Each line starting with #' is part of the docstring.
  • The roxygen package can separate those blocks from our code, and produce valid R documentation for us! 🤯

Building Documentation

  • Let’s modify the docstring accordingly.
  • execute the document() function.
  • After that, the documentation is visible to us:
?hello
ℹ Rendering development documentation for "hello"
  • Look in the Help pane in RStudio!

NAMESPACE

  • Did you notice the @export tag in the docstring?
  • when we ran document(), roxygen changed the NAMESPACE file based upon that tag.
  • Go and look at that file!
  • The contents of NAMESPACE specify what is visible to a user who does library(toypackage).
  • Try removing the @export tag, and document() again. Look back at NAMESPACE!

check() again!

check()
0 errors ✔ | 0 warnings ✔ | 0 notes ✔

Time to INSTALL the package

  • Ok, great. Now we have a minimal package that works to a certain extent 🙂.
  • We must install it into our package library, in order to be able to use it like any other package (same as when we did install.packages("ggplot2"))
  • Notice that R installs your packages here:
.libPaths()
[1] "/Users/floswald/Library/R/arm64/4.2/library"                         
[2] "/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library"
  • We install our package into that location with install()
  • Look out for the final message:
* DONE (toypackage)

👏

New Session - Try it Out!

  • Restart Rstudio
  • type into the console
library(toypackage)
  • and then let’s see our cool 😎 function:
hello("John Spencer Blues Explosion")
[1] "hello, John Spencer Blues Explosion"
  • Works! Bingo! 🎉

Automatically Testing Our Code

  • We verified ourselves that this works.
  • We had our own, informal, way to convince ourselves that it works.
  • We knew which steps we had to follow until we would conclude that “yes, this works”.

The Time Factor

If you come back to this in 2 months time you probably

  1. won’t remember all the steps you have taken (above)
  2. won’t be able to reproduce what you tested today!

The Scale Factor

As your package grows, you will find it hard to come back to all components repeatedly, making sure they all still work as intended (now that they may depend on other parts of your code)

Unit Testing and Continuous Integration (CI)

Enter Unit Testing

  • Automatic Unit Testing or Continuous Integration (CI) is our best response to this.
  • We still have to design and write the tests, but we can offload the work to run the tasks repeatedly, and automatically, to a helpful infrastructure.
library(devtools)
use_testthat()
  • then
use_test("sayhello")
• Modify 'tests/testthat/test-sayhello.R'

Writing Unit Tests

  • Ideally, each function in our R/ folder is covered by a corresponding test.

What Is a Test?

The purpose of a test is to verify that some part of your code, a function in most cases, works as intended.

  • Modify 'tests/testthat/test-sayhello.R' like so
test_that("hello function works", {
  who = "James T. Kirk"
  expect_equal(hello(who), paste("hello,",who))
})
  • Ready for 🚀 takeoff?

Running all unit tests

  • You can run each test file separately to try it out (you must do library(testthat) first)
  • It’s better practice to test the entire package though:
> test()
ℹ Testing toypackage

Attaching package: ‘testthat’

The following object is masked from ‘package:devtools’:

    test_file

| F W S  OK | Context
|         1 | sayhello                                         

══ Results ══════════════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1 ]
  • Celebrate! 🎉 🥳 🎊

Using other packages

  • Most likely our package would depend some other package as well.
  • Like we could choose the export some of our functions, we now may want to import some functions from elsewhere.
  • Suppose we want to use the dplyr package:
> use_package("dplyr")
✔ Adding 'dplyr' to Imports field in DESCRIPTION
• Refer to functions with `dplyr::fun()`
  • Let’s check the DESCRIPTION file to see what happened.

Hook it up to GitHub!

  • It’s fairly easy to publish our new package to a github repo.
  • Let’s use_github()
use_github()
  • answer all the prompts and end up here!

Adding a Readme file

  • We know by now that readme files are very important on any git repo.
  • Let’s add one here as well!
  • the usethis::use_readme_rmd() function is perfect for this:
usethis::use_readme_rmd()
  • If we want to automatically run our tests on a remote server called github actions, we can call this function as well to set this up:
use_github_actions()
  • let’s re-build the package now. (look for rstudio button install in build tab)

Adding a Vignette

  • Vignette’s are a great feature of R packages. They are full text introductions of the package to a first time user.
  • A tutorial for your package.
  • This is going to be much more verbose and spiked with example input and ouput than the standard documentation.
  • Often it features the main use case of your package.
  • There is an entire chapter on r-pkgs dedicated to this!

Adding the Vignette(s)

usethis::use_vignette("vignette-toypackage-1")

Deploy package documention on a website



  • 🚨 Now we are entering the seriously cool zone of R package development 😎

  • Wouldn’t it be 🤩 amazing if all of our package documentation, the content of our readme, and any explanatory articles we might have written as vignettes, were available on a (free to host!) website which is always up to date?

You Bet It’s Cool 😎

Deploy package documention on a website 3

  • Ready?
usethis::use_pkgdown()
  • pkgdown is a package for website and docs building.
  • Let’s build that site!
pkdown::build_site()
  • Let’s get gh-actions going
usethis::use_pkgdown_github_pages()
  • commit everything and push to github!

Summary

Key Points

  1. RStudio greatly facilitates R package development.
  2. R packages contain code, data and documentation in highly structured fashion.
  3. We are encouraged to run automated unit tests.
  4. It is relatively straightforward to publish the package to github for collaboration.
  5. It is equally straightforward to build and publish a full website with package documentation and vignettes, hosted for free on github.com.