Introduction to the fastverse

An Extensible Suite of High-Performance and Low Dependency Packages for Statistical Computing and Data Manipulation in R

Sebastian Krantz

2022-05-31

The fastverse is an extensible suite of R packages, developed independently by various people, that jointly contribute to the objectives of high performance and low dependencies in statistical computing and data manipulation in R.

The fastverse installs 6 core packages (data.table, collapse, matrixStats, kit, magrittr and fst) that are (by default) harmonized and attached with library(fastverse). These packages were selected because they provide high quality compiled code for most common statistical and data manipulation tasks, have carefully managed APIs, jointly depend only on base R and Rcpp, and work remarkably well together.

library(fastverse)
# -- Attaching packages --------------------------------------- fastverse 0.2.4 --
# v data.table  1.14.2     v collapse    1.8.3 
# v magrittr    2.0.3      v matrixStats 0.61.0
# v kit         0.0.12     v fst         0.9.8

The fastverse package then provides functionality familiar from the tidyverse package, such as checking and reporting namespace clashes, and utilities for updating packages, listing dependencies, etc.

# Checking for any updates
fastverse_update()
# The following packages are out of date:
# 
#  * matrixStats (0.61.0 -> 0.62.0)
#  * fstcore     (0.9.8 -> 0.9.12)
# 
# Start a clean R session then run:
# install.packages(c("matrixStats", "fstcore"))

A key feature of the fastverse is that it can liberally be extended with other packages that can be loaded and managed using the tools this package provides. Users are encouraged to make use of the features described in the remainder of this vignette to extend the fastverse or even create ‘verses’ of packages that suit their personal analysis needs. A selection of suggested packages is provided on the website [1].

Extending the fastverse for the Session

After the core packages have been attached with library(fastverse), it is possible to extend the fastverse for the current session by adding any number of additional packages with fastverse_extend(). This will attach the packages and, by default, check for namespace clashes with attached packages, as well as among the added packages.

# Extend the fastverse for the session
fastverse_extend(xts, roll, fasttime)
# -- Attaching extension packages ----------------------------- fastverse 0.2.4 --
# v xts      0.12.1     v fasttime 1.0.2 
# v roll     1.1.6
# -- Conflicts ------------------------------------------ fastverse_conflicts() --
# x xts::first() masks data.table::first()
# x xts::last()  masks data.table::last()
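
The idea behind such a conflict check can be illustrated with a minimal base R sketch. This is not the fastverse implementation: find_clashes() is a hypothetical helper that takes a named list of exported names, ordered as the packages are attached, and reports every name exported by more than one package. The export vectors below are shortened stand-ins for the real export lists.

```r
# Hypothetical helper (not part of fastverse): find names exported by
# more than one package. `exports` is a named list of character vectors,
# ordered as the packages are attached.
find_clashes <- function(exports) {
  all_names <- unlist(exports, use.names = FALSE)
  clashing  <- unique(all_names[duplicated(all_names)])
  out <- lapply(clashing, function(f) {
    # Names of all packages whose exports contain f
    names(exports)[vapply(exports, function(e) f %in% e, logical(1L))]
  })
  names(out) <- clashing
  out
}

# Shortened stand-ins for the real export lists (assumption for illustration)
exports <- list(
  data.table = c("first", "last", "fread"),
  xts        = c("first", "last", "as.xts")
)
clashes <- find_clashes(exports)

# The package attached last masks the one attached first
for (f in names(clashes)) {
  pk <- clashes[[f]]
  cat(sprintf("%s::%s() masks %s::%s()\n", pk[length(pk)], f, pk[1L], f))
}
```

For packages that are actually installed, the export vectors could be obtained with getNamespaceExports() from base R.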

# See that these are now part of the fastverse
fastverse_packages()
#  [1] "data.table"  "magrittr"    "kit"         "collapse"    "matrixStats"
#  [6] "fst"         "xts"         "roll"        "fasttime"    "fastverse"

# They are also saved in a like-named option 
options("fastverse.extend")
# $fastverse.extend
# [1] "xts"      "roll"     "fasttime"

All fastverse packages (or particular packages) can be detached using fastverse_detach.

# Detaches all packages (including the fastverse) but does not (default) unload them
fastverse_detach()

For programming purposes it is also possible to pass vectors of packages to both fastverse_extend and fastverse_detach [2]. The defaults of fastverse_detach are set such that detaching is very ‘light’. Packages are not unloaded and all fastverse options set for the session are kept.

# Extensions are still here ...
options("fastverse.extend")
# $fastverse.extend
# [1] "xts"      "roll"     "fasttime"

# Thus attaching the fastverse again will include them
library(fastverse)
# -- Attaching packages --------------------------------------- fastverse 0.2.4 --
# v data.table  1.14.2     v fst         0.9.8 
# v magrittr    2.0.3      v xts         0.12.1
# v kit         0.0.12     v roll        1.1.6 
# v collapse    1.8.3      v fasttime    1.0.2 
# v matrixStats 0.61.0
# -- Conflicts ------------------------------------------ fastverse_conflicts() --
# x xts::first() masks data.table::first()
# x xts::last()  masks data.table::last()
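
The flexible input handling mentioned earlier (bare package names or a character vector) can be sketched in base R along the lines described in footnote 2: capture the ... expression with substitute(c(...)), try to evaluate it, and fall back to coercing the expression to character. capture_pkgs() is a hypothetical name used for illustration; the actual internals of fastverse may differ in detail.

```r
# Hypothetical helper mimicking the mechanism described in footnote 2
capture_pkgs <- function(...) {
  expr <- substitute(c(...))                     # unevaluated call, e.g. c(xts, roll)
  res  <- tryCatch(eval(expr, parent.frame()),   # works if a character vector was passed
                   error = function(e) NULL)     # fails for unquoted, unknown names
  if (!is.character(res)) res <- as.character(expr[-1L])  # fall back to the symbol names
  res
}

# Unquoted names, as in interactive use
from_names <- capture_pkgs(xts, roll)

# A character vector, as in programmatic use
pkgs <- c("xts", "roll")
from_vector <- capture_pkgs(pkgs)

identical(from_names, from_vector)
```

Both calls yield the same character vector, which is why fastverse_extend(xts, roll) and fastverse_extend(pkgs) can be used interchangeably.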

‘Harder’ modes of detaching can be achieved using arguments unload = TRUE (and force = TRUE) to (forcefully) detach and unload fastverse packages, and/or session = TRUE which will clear all fastverse options set [3].

# Detaching and unloading all packages and clearing options
fastverse_detach(session = TRUE, unload = TRUE)
# Warning: 'magrittr' namespace cannot be unloaded:
#   namespace 'magrittr' is imported by 'stringr' so cannot be unloaded

fastverse_detach can also be used to detach any other attached packages not part of the fastverse.

Since options("fastverse.extend") keeps track of which packages were added to the fastverse for the current session, it is also possible to set it before loading the fastverse e.g. 

options(fastverse.extend = c("dygraphs", "tidyfast"))
library(fastverse)
# -- Attaching packages --------------------------------------- fastverse 0.2.4 --
# v data.table  1.14.2      v matrixStats 0.61.0 
# v magrittr    2.0.3       v fst         0.9.8  
# v kit         0.0.12      v dygraphs    1.1.1.6
# v collapse    1.8.3       v tidyfast    0.2.1

fastverse_detach(session = TRUE)

Permanent Extensions

fastverse_extend and fastverse_detach both have an argument permanent = TRUE which can be used to make these changes persist across R sessions. This is implemented using a global configuration file saved to the package directory [4].

For example, suppose most of my work involves time series analysis, and I would like to add xts, zoo, roll, and dygraphs to my fastverse. Let’s say I also don’t really use the fst file format, and I don’t really need matrixStats either as I can do most of the time series statistics I need with base R and collapse. Let’s finally say that I don’t want xts::first and xts::last to mask data.table::first and data.table::last.

Then I could permanently modify my fastverse as follows [5]:

library(fastverse)
# -- Attaching packages --------------------------------------- fastverse 0.2.4 --
# v data.table  1.14.2     v collapse    1.8.3 
# v magrittr    2.0.3      v matrixStats 0.61.0
# v kit         0.0.12     v fst         0.9.8

# Adding extensions
fastverse_extend(xts, zoo, roll, dygraphs, permanent = TRUE)
# -- Attaching extension packages ----------------------------- fastverse 0.2.4 --
# v xts      0.12.1      v dygraphs 1.1.1.6
# v roll     1.1.6
# -- Conflicts ------------------------------------------ fastverse_conflicts() --
# x zoo::as.Date()         masks base::as.Date()
# x zoo::as.Date.numeric() masks base::as.Date.numeric()
# x xts::first()           masks data.table::first()
# x xts::last()            masks data.table::last()

# Removing some core packages
fastverse_detach(data.table, fst, matrixStats, permanent = TRUE)

# Adding data.table again, so it is attached last
fastverse_extend(data.table, permanent = TRUE)
# -- Attaching extension packages ----------------------------- fastverse 0.2.4 --
# v data.table 1.14.2
# -- Conflicts ------------------------------------------ fastverse_conflicts() --
# x data.table::first() masks xts::first()
# x data.table::last()  masks xts::last()

To verify our modification, we can see the order in which the packages are attached, and do a conflict check:

# This will be the order in which packages are attached
fastverse_packages(include.self = FALSE)
# [1] "magrittr"   "kit"        "collapse"   "xts"        "zoo"       
# [6] "roll"       "dygraphs"   "data.table"

# Check conflicts to make sure data.table functions take precedence
fastverse_conflicts()
# -- Conflicts ------------------------------------------ fastverse_conflicts() --
# x zoo::as.Date()         masks base::as.Date()
# x zoo::as.Date.numeric() masks base::as.Date.numeric()
# x data.table::first()    masks xts::first()
# x data.table::last()     masks xts::last()

Note that options("fastverse.extend") is still empty, because we have written those changes to a config file [6]. Now let’s see if our permanent modification worked:

# detach all packages and clear all options
fastverse_detach(session = TRUE)
library(fastverse) 
# -- Attaching packages --------------------------------------- fastverse 0.2.4 --
# v magrittr   2.0.3       v zoo        1.8.10 
# v kit        0.0.12      v roll       1.1.6  
# v collapse   1.8.3       v dygraphs   1.1.1.6
# v xts        0.12.1      v data.table 1.14.2
# -- Conflicts ------------------------------------------ fastverse_conflicts() --
# x zoo::as.Date()         masks base::as.Date()
# x zoo::as.Date.numeric() masks base::as.Date.numeric()
# x data.table::first()    masks xts::first()
# x data.table::last()     masks xts::last()

After this permanent modification, the fastverse can still be extended for the session:

# Extension for the session
fastverse_extend(Rfast2, coop)
# -- Attaching extension packages ----------------------------- fastverse 0.2.4 --
# v Rfast2 0.1.1     v coop   0.6.3
# -- Conflicts ------------------------------------------ fastverse_conflicts() --
# x coop::covar() masks Rfast2::covar()

# These packages go here
options("fastverse.extend")
# $fastverse.extend
# [1] "Rfast2" "coop"

# This fetches packages from both the file and the option
fastverse_packages()
#  [1] "magrittr"   "kit"        "collapse"   "xts"        "zoo"       
#  [6] "roll"       "dygraphs"   "data.table" "Rfast2"     "coop"      
# [11] "fastverse"

As long as the current installation of the fastverse is kept, these modifications will persist across R sessions. Needless to say this is not ideal as reinstallation of the fastverse will remove the config file. Therefore, the fastverse also offers a persistent and more flexible mechanism to configure it inside projects.

Custom fastverse Configurations for Projects

You can put together a custom collection of packages for a project, and load / manage them with library(fastverse).

For this you need to include a configuration file named .fastverse (no file extension) inside a project directory, and place inside that file the names of packages to be loaded when calling library(fastverse) [7]. Note that all packages to be loaded as core fastverse for your project need to be included in that file, in the order they should be attached.

In addition, you can set global options and environment variables, either before or after the list of packages. Options must be prefixed with _opt_ and environment variables with _env_, and either must be placed on separate lines. For example, including a .fastverse script like this:

_opt_collapse_mask = c("manip", "helper")
_opt_fastverse.install = TRUE

data.table, kit, magrittr, collapse, qs, fixest, 
ranger, robustbase, decompr

_opt_max.print = 100
_opt_kit.nThread = 4
_env_NCRAN = TRUE

in a project directory, and placing library(fastverse) at the top of an R script in the project will first set options(collapse_mask = c("manip", "helper"), fastverse.install = TRUE) [8], then attach all the packages in the order provided, and then set options(max.print = 100, kit.nThread = 4) and Sys.setenv(NCRAN = TRUE). Note that packages can be spread across multiple lines, but need to be together, i.e. they cannot be separated by options.
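
To make the file layout concrete, the following base R sketch writes such a config file and splits its lines into options, environment variables and package names. parse_config() is a hypothetical helper written for this illustration only; it is not how fastverse reads the file internally, and the file is written to a temporary directory rather than a real project.

```r
# Example config lines, shortened from the file shown above
config <- c(
  "_opt_fastverse.install = TRUE",
  "",
  "data.table, kit, magrittr, collapse, qs, fixest, ",
  "ranger, robustbase, decompr",
  "",
  "_opt_max.print = 100",
  "_env_NCRAN = TRUE"
)

# Write the file (here to a temporary directory standing in for a project)
path <- file.path(tempdir(), ".fastverse")
writeLines(config, path)

# Hypothetical parser (an assumption, not the fastverse implementation)
parse_config <- function(lines) {
  lines  <- trimws(lines)
  lines  <- lines[nzchar(lines)]
  is_opt <- startsWith(lines, "_opt_")
  is_env <- startsWith(lines, "_env_")
  # Everything else is package names, separated by commas and/or whitespace
  pkg_text <- paste(lines[!is_opt & !is_env], collapse = " ")
  pkgs <- strsplit(pkg_text, "[,[:space:]]+")[[1L]]
  list(
    options  = sub("^_opt_", "", lines[is_opt]),
    env      = sub("^_env_", "", lines[is_env]),
    packages = pkgs[nzchar(pkgs)]
  )
}

cfg <- parse_config(readLines(path))
str(cfg)
```

The package names come out in file order, which matches the order in which they would be attached.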

Using the fastverse to jointly load important packages and set important options can facilitate package management inside projects and serve as a bridge between loading packages individually and using more rigorous package and namespace management solutions such as renv, conflicted, box or import.

At the most basic level, loading packages with the fastverse displays the package versions and checks namespace conflicts, helping you spot issues that might arise as packages are updated. It also allows you to easily check the dependencies and update status of packages used in your project with fastverse_sitrep(), and update if necessary with fastverse_update().

Using a config file in a project will ignore any global configuration as discussed in the previous section. You can still extend the fastverse inside a project session using fastverse_extend, or by setting options("fastverse.extend") before library(fastverse) [9].

Creating Separate Package-Verses

An added feature in version 0.2.0 of the fastverse is the ability to create wholly separate and fully customizable verses with the fastverse_child() function. Let’s say I would like to create a verse for time series analysis that I want to keep separate from the fastverse. This is easily done using e.g.

fastverse_child(
  name = "tsverse", 
  title = "Time Series Package Verse", 
  pkg = c("xts", "roll", "zoo", "tsbox", "urca", "tseries", "tsutils", "forecast"), 
  maintainer = 'person("GivenName", "FamilyName", role = "cre", email = "your@email.com")',
  dir = "C:/Users/.../Documents", 
  theme = "tidyverse")

By default (install = TRUE, keep.dir = TRUE) the package is installed and a source directory is created under dir/name, allowing further edits to the package. Such fastverse children inherit 90% of the functionality of the fastverse package: they are not permanently globally extensible and cannot bear children themselves, but can be configured for projects [10] and extended in the session. The function uses a prepared ‘child’ branch of the GitHub repository, and thus does not require any further packages such as devtools.

Dependencies, Situational Reports and Updating

Just like its tidyverse equivalent, fastverse_deps() (recursively) determines the joint dependencies of fastverse packages and also checks local versions against CRAN versions.

# Recursively determine the joint dependencies of the current fastverse configuration
fastverse_deps(recursive = TRUE) # Returns a data frame
#          package       cran      local behind
# 1       magrittr      2.0.3      2.0.3  FALSE
# 2            kit     0.0.11     0.0.12  FALSE
# 3       collapse      1.7.6      1.8.3  FALSE
# 4            xts     0.12.1     0.12.1  FALSE
# 5            zoo     1.8.10     1.8.10  FALSE
# 6           roll      1.1.6      1.1.6  FALSE
# 7       dygraphs    1.1.1.6    1.1.1.6  FALSE
# 8     data.table     1.14.2     1.14.2  FALSE
# 9         Rfast2      0.1.3      0.1.1   TRUE
# 10          coop      0.6.3      0.6.3  FALSE
# 11          RANN      2.6.1      2.6.1  FALSE
# 12          Rcpp    1.0.8.3    1.0.8.3  FALSE
# 13 RcppArmadillo 0.11.1.1.0 0.11.1.1.0  FALSE
# 14       RcppGSL     0.3.11     0.3.10   TRUE
# 15  RcppParallel      5.1.5      5.1.4   TRUE
# 16  RcppZiggurat      0.1.6      0.1.6  FALSE
# 17         Rfast      2.0.6      2.0.4   TRUE
# 18     base64enc      0.1.3      0.1.3  FALSE
# 19        digest     0.6.29     0.6.29  FALSE
# 20       fastmap      1.1.0      1.1.0  FALSE
# 21     htmltools      0.5.2      0.5.2  FALSE
# 22   htmlwidgets      1.5.4      1.5.4  FALSE
# 23      jsonlite      1.8.0      1.8.0  FALSE
# 24       lattice    0.20.45    0.20.44   TRUE
# 25         rlang      1.0.2      1.0.2  FALSE
# 26          yaml      2.3.5      2.3.5  FALSE
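
Since the result is a data frame, it can be filtered like any other. The sketch below rebuilds a few rows of the printout above by hand, so it runs without fastverse or internet access. It assumes that the version columns can be treated as character strings and that behind is a logical column, which the printout suggests but does not confirm.

```r
# Stand-in for part of the data frame shown above (values copied from the printout)
deps <- data.frame(
  package = c("kit", "Rfast2", "coop", "RcppGSL", "lattice"),
  cran    = c("0.0.11", "0.1.3", "0.6.3", "0.3.11", "0.20.45"),
  local   = c("0.0.12", "0.1.1", "0.6.3", "0.3.10", "0.20.44"),
  behind  = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Packages whose local version lags behind CRAN
outdated <- deps$package[deps$behind]
print(outdated)

# The same flag recomputed with base R version comparison;
# note that a local version ahead of CRAN (kit) is not "behind"
recomputed <- package_version(deps$local) < package_version(deps$cran)
print(identical(recomputed, deps$behind))
```

Comparing with package_version() rather than plain strings matters because "0.3.10" sorts before "0.3.9" as text but is the newer version.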

Additional flexibility is offered by the pkg argument allowing dependency and update status checks for any other packages. fastverse_sitrep() displays the same information in a more elegant printout, also showing the version of R, and whether any global or project-level configuration files - as discussed in previous sections - are used.

# Check versions and update status of packages and dependencies
fastverse_sitrep() # default is recursive = FALSE
# -- fastverse 0.2.4: Situation Report -------------------------------- R 4.1.1 --
#  * Global config file: TRUE
#  * Project config file: FALSE
# -- Core packages --------------------------------------------------------------- 
#  * magrittr      (2.0.3)
#  * kit           (0.0.11)
#  * collapse      (1.7.6)
#  * xts           (0.12.1)
#  * zoo           (1.8.10)
#  * roll          (1.1.6)
#  * dygraphs      (1.1.1.6)
#  * data.table    (1.14.2)
# -- Extension packages ---------------------------------------------------------- 
#  * Rfast2        (0.1.1 < 0.1.3)
#  * coop          (0.6.3)
# -- Dependencies ---------------------------------------------------------------- 
#  * RANN          (2.6.1)
#  * Rcpp          (1.0.8.3)
#  * RcppArmadillo (0.11.1.1.0)
#  * RcppParallel  (5.1.4 < 5.1.5)
#  * Rfast         (2.0.4 < 2.0.6)
#  * htmltools     (0.5.2)
#  * htmlwidgets   (1.5.4)
#  * lattice       (0.20.44 < 0.20.45)

fastverse_update() can be used to (default) print an install.packages() statement to update fastverse packages and dependencies, or to install updates straight away (install = TRUE). In all three functions, check.deps = FALSE can be specified to exclude dependencies of fastverse packages, recursive = TRUE can be used to check all dependencies, and include.self = TRUE can be used to also check for updates of the fastverse package itself.

Other fastverse Options

Apart from "fastverse.extend", the fastverse also has options "fastverse.install", "fastverse.styling" and "fastverse.quiet" [11]. Setting options(fastverse.install = TRUE) before library(fastverse) will make sure any packages missing on your system will be installed beforehand. This can also be done ex-post using the fastverse_install() function. Setting options(fastverse.styling = FALSE) will disable coloured text printed to the R console (as done for this vignette). options(fastverse.quiet = TRUE) will omit any messages printed from library(fastverse) and fastverse_extend():

fastverse_detach()
options(fastverse.quiet = TRUE)
library(fastverse) # Nothing to see here

# This gives lots of function clashes with data.table, but they are not displayed in quiet mode
fastverse_extend(lubridate)

If you only want to omit a function clash check when calling fastverse_extend, you can also use fastverse_extend(..., check.conflicts = FALSE).

Conclusion

The fastverse was developed principally for 2 reasons: to promote quality high-performance software development for R, and to provide a flexible approach to package loading and management in R, particularly for users wishing to combine various high-performance packages in statistical workflows. To the extent that high-performance software development in R continues to prioritize low-dependency and stable APIs, complex statistical and project workflows can be developed without sophisticated package management solutions.

# Resetting the fastverse to defaults (clearing all permanent extensions and options)
fastverse_reset()
# Detaching 
fastverse_detach()

  1. Let me know about other packages you think should be featured there.

  2. In particular, the ... expression is first captured using substitute(c(...)), and then evaluated inside tryCatch. If this evaluation fails or does not result in a character vector, the expression is coerced to character.

  3. options("fastverse.quiet") and options("fastverse.styling") will only be cleared if all packages are detached. If selected packages are detached, they are removed from options("fastverse.extend").

  4. Thus it will be removed when the fastverse is reinstalled.

  5. I note that namespace conflicts can also be detected and handled with the conflicted package on CRAN.

  6. When fetching the names of fastverse packages, fastverse_packages first checks any config file and then checks options("fastverse.extend").

  7. You can place package names in that file in any manner you deem suitable: separated using spaces or commas, on one or multiple lines. Note that the file will be read from left to right and from top to bottom. Packages are attached in the order found in the file.

  8. Setting options(fastverse.install = TRUE) before loading the packages will make sure any packages missing on your system will be installed beforehand. options(collapse_mask = ...) can be used to mask base R and dplyr functions with faster versions provided in the collapse package. See help("collapse-options").

  9. If you populate options("fastverse.extend") before calling library(fastverse), with _opt_fastverse.install = TRUE set in .fastverse, the availability of these packages will also be guaranteed.

  10. Using, in this case, a .tsverse config file.

  11. I note that at this point in time it is not possible to permanently set options("fastverse.quiet") or options("fastverse.styling").