fastTopics


fastTopics is an R package implementing fast, scalable optimization algorithms for fitting topic models (“grade of membership” models) and non-negative matrix factorizations to count data. The methods exploit the close relationship between the multinomial topic model (also known as “probabilistic latent semantic indexing”) and Poisson non-negative matrix factorization. The package also provides tools to compare, annotate and visualize model fits, including functions to efficiently create “structure plots” and to identify key features in topics. The fastTopics package is a successor to the CountClust package.
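The relationship between the two models can be illustrated without the package. The following is a minimal base R sketch, not the package's implementation: it starts from arbitrary Poisson NMF parameters and rescales them so that they parameterize a multinomial topic model with the same fitted proportions. All object names (Lmat, Fmat, Ftopic, Ltopic and so on) are hypothetical and chosen only for this illustration.

```r
# Illustrative sketch in base R; all object names are hypothetical.
set.seed(1)
n <- 5   # number of documents (or cells)
m <- 8   # number of words (or genes)
k <- 3   # number of topics

# Arbitrary non-negative Poisson NMF parameters: count x[i,j] is
# modeled as Poisson with rate Lambda[i,j].
Lmat   <- matrix(runif(n * k), n, k)   # loadings, n x k
Fmat   <- matrix(runif(m * k), m, k)   # factors, m x k
Lambda <- tcrossprod(Lmat, Fmat)       # Lmat %*% t(Fmat)

# Rescale so that each column of the factors matrix sums to 1, and
# move the scale into the loadings. The product is unchanged.
u       <- colSums(Fmat)
Ftopic  <- sweep(Fmat, 2, u, "/")
Lscaled <- sweep(Lmat, 2, u, "*")

# Normalize each row of the loadings to get topic proportions. The
# row sums act as per-document "size factors".
s      <- rowSums(Lscaled)
Ltopic <- Lscaled / s

# Multinomial probabilities implied by the topic model.
P <- tcrossprod(Ltopic, Ftopic)

# Each row of P is a probability vector, and it equals the Poisson
# rates for that row divided by their total.
stopifnot(isTRUE(all.equal(rowSums(P), rep(1, n))))
stopifnot(isTRUE(all.equal(P, Lambda / rowSums(Lambda))))
```

Because the rescaling leaves the product of loadings and factors unchanged, a Poisson NMF fit can be converted to a topic model fit after the optimization is done, which is what makes the faster Poisson NMF algorithms useful for fitting topic models.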

If you find a bug, or you have a question or feedback on this software, please post an issue.

Citing this work

If you find the fastTopics package or any of the source code in this repository useful for your work, please cite:

Kushal K. Dey, Chiaowen Joyce Hsiao and Matthew Stephens (2017). Visualizing the structure of RNA-seq expression data using grade of membership models. PLoS Genetics 13, e1006599.

Peter Carbonetto, Kevin Luo, Kushal Dey, Joyce Hsiao and Matthew Stephens (2021). fastTopics: fast algorithms for fitting topic models and non-negative matrix factorizations to count data. R package version 0.4-11. https://github.com/stephenslab/fastTopics

License

Copyright (c) 2019-2021, Peter Carbonetto and Matthew Stephens.

All source code and software in this repository are made available under the terms of the MIT license.

Quick Start

Install and load the package:

remotes::install_github("stephenslab/fastTopics")
library(fastTopics)

Note that installing the package will require a C++ compiler setup that is appropriate for the version of R installed on your computer. For details, refer to the documentation on the CRAN website.

For guidance on using fastTopics to analyze gene expression data, see the single-cell RNA-seq vignette, part 1 and part 2.

Also, try running the small example that illustrates the fast model fitting algorithms:

example("fit_poisson_nmf")
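For readers who would rather see the calls spelled out, here is a rough sketch of the same kind of workflow on simulated data. It assumes the installed version of fastTopics exports simulate_count_data, fit_poisson_nmf and poisson2multinom with the arguments shown; the data dimensions, the number of topics and the number of iterations are arbitrary choices made for illustration, and the block is skipped when the package is not installed.

```r
# Sketch only: argument names are assumed to match the installed
# version of fastTopics; the whole block is skipped if the package
# is not available.
if (requireNamespace("fastTopics", quietly = TRUE)) {
  library(fastTopics)
  set.seed(1)

  # Simulate a small 80 x 100 count matrix from a Poisson NMF model
  # with 3 topics.
  dat <- simulate_count_data(80, 100, k = 3)

  # Fit a Poisson NMF with k = 3 topics (50 iterations is an
  # arbitrary, small number chosen to keep the example quick).
  fit <- fit_poisson_nmf(dat$X, k = 3, numiter = 50)

  # Convert the Poisson NMF fit to a multinomial topic model fit.
  fit_multinom <- poisson2multinom(fit)

  # The topic proportions for each row should sum to (about) 1.
  print(range(rowSums(fit_multinom$L)))
}
```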

See the package documentation for more information.

Developer notes

To prepare the package for CRAN, remove the “Remotes” field from the DESCRIPTION file and set on_cran <- TRUE in helper_functions.R. Then run R CMD build fastTopics to build the source package.

This is the command used to check the package before submitting to CRAN:

library(rhub)
check_for_cran(".",show_status = TRUE,
  env_vars = c(`_R_CHECK_FORCE_SUGGESTS_` = "false",
               `_R_CHECK_CRAN_INCOMING_USE_ASPELL_` = "true"))

Credits

The fastTopics R package was developed by Peter Carbonetto, Kevin Luo, Kushal Dey, Joyce Hsiao and Matthew Stephens at the University of Chicago.