CRAN Task View: Missing Data

Maintainer: Julie Josse, Imke Mayer, Nicholas Tierney, and Nathalie Vialaneix (r-miss-tastic team)
Contact: r-miss-tastic at clementine.wf
Version: 2022-08-11
URL: https://CRAN.R-project.org/view=MissingData
Source: https://github.com/cran-task-views/MissingData/
Contributions: Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.
Citation: Julie Josse, Imke Mayer, Nicholas Tierney, and Nathalie Vialaneix (r-miss-tastic team) (2022). CRAN Task View: Missing Data. Version 2022-08-11. URL https://CRAN.R-project.org/view=MissingData.
Installation: The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("MissingData", coreOnly = TRUE) installs all the core packages, and ctv::update.views("MissingData") installs all packages that are not yet installed or not up to date. See the CRAN Task View Initiative for more details.

Missing data are very frequently found in datasets. Base R provides a few options to handle them using computations that involve only observed data (na.rm = TRUE in functions such as mean and var, or use = "complete.obs", "na.or.complete", or "pairwise.complete.obs" in functions such as cov and cor). The base package stats also contains the generic function na.action, which extracts information on the NA action used to create an object. In addition, the package ie2misc contains a dyadic operator + that behaves differently from the original + operator regarding missing data.
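The base R options mentioned above can be illustrated with a minimal sketch. The data frame df and its values are made up for this example; only functions from base R and stats are used.

```r
# Made-up illustration data with one missing value per column.
df <- data.frame(
  x = c(1, 2, NA, 4, 5),
  y = c(2, NA, 6, 9, 10),
  z = c(5, 3, 4, NA, 1)
)

# By default, a single NA propagates to the result.
mean_default <- mean(df$x)

# na.rm = TRUE computes the statistic on the observed values only.
mean_obs <- mean(df$x, na.rm = TRUE)

# "complete.obs" keeps only rows without any NA (here rows 1 and 5).
cor_complete <- cor(df, use = "complete.obs")

# "pairwise.complete.obs" uses, for each pair of variables,
# all rows where both variables are observed.
cor_pairwise <- cor(df, use = "pairwise.complete.obs")

# lm() drops incomplete rows by default (na.omit);
# na.action() reports which rows were dropped.
fit <- lm(y ~ x, data = df)
dropped <- na.action(fit)
print(dropped)
```

Note that the complete-case and pairwise correlation matrices are generally computed on different subsets of rows, so their entries need not agree.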

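These basic options are complemented by many packages on CRAN. This task view focuses on the most important ones, which were published more than one year ago and are regularly updated. The task view is structured into the following main topics: exploration of missing data, likelihood based approaches, single imputation, multiple imputation, weighting methods, specific types of data, specific tasks, and specific application fields.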

In addition to the present task view, this reference website on missing data might also be helpful. Complementary information can also be found in the TimeSeries, SpatioTemporal, Survival, and OfficialStatistics task views. Note that most packages for temporal and spatio-temporal interpolation and for censored data are not covered by the Missing Data task view.

If you think we have missed some important packages in this list, please e-mail the maintainers or submit an issue or pull request in the GitHub repository linked above.

Exploration of missing data

Likelihood based approaches

Single imputation

Multiple imputation

Some of the above-mentioned packages can also handle multiple imputation.

In addition, mitools provides a generic approach to handle multiple imputation in combination with any imputation method, NADIA provides a uniform interface to compare the performance of several imputation algorithms, cobalt computes balance tables and plots for multiply imputed datasets, and SynthTools provides confidence intervals for multiply imputed datasets.
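To make the idea of combining analyses across multiply imputed datasets concrete, here is a minimal base R sketch of pooling with Rubin's rules. It is an illustration under stated assumptions, not the implementation of any package listed here: the data are simulated, and the imputation step is a deliberately crude stochastic regression imputation that ignores parameter uncertainty. Packages such as mitools automate the pooling step for real analyses.

```r
set.seed(1)

# Simulated data (made up for illustration): y depends linearly on x,
# and 10 values of y are set to missing completely at random.
n <- 50
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
y[sample(n, 10)] <- NA
dat <- data.frame(x = x, y = y)

# Crude imputation model fitted on the observed cases (assumption:
# a simplified scheme that does not propagate parameter uncertainty).
m <- 5
obs_fit <- lm(y ~ x, data = dat)
sigma_hat <- summary(obs_fit)$sigma
miss <- is.na(dat$y)

# Create m completed datasets by adding random noise to the predictions.
imputed <- lapply(seq_len(m), function(i) {
  d <- dat
  d$y[miss] <- predict(obs_fit, newdata = d[miss, , drop = FALSE]) +
    rnorm(sum(miss), sd = sigma_hat)
  d
})

# Run the same analysis on each completed dataset.
fits <- lapply(imputed, function(d) lm(y ~ x, data = d))
est <- sapply(fits, function(f) coef(f)[["x"]])
se2 <- sapply(fits, function(f) vcov(f)["x", "x"])

# Rubin's rules: pooled estimate, within- and between-imputation variance.
pooled_est <- mean(est)
within_var <- mean(se2)
between_var <- var(est)
total_var <- within_var + (1 + 1 / m) * between_var
pooled_se <- sqrt(total_var)

print(c(estimate = pooled_est, se = pooled_se))
```

The total variance is never smaller than the average within-imputation variance, which reflects the extra uncertainty due to the missing values.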

Weighting methods

Specific types of data

Specific tasks

Specific application fields

CRAN packages

Core: Amelia, hot.deck, imputeTS, jomo, mice, missMDA, naniar, softImpute, VIM, yaImpute.
Regular: accelmissing, ade4, AeRobiology, alleHap, areal, bayesCT, BayesMallows, bcROCsurface, biclustermd, BIFIEsurvey, BLOQ, BMTAR, bnstruct, bootImpute, brxx, CALIBERrfimpute, cat, cglasso, ClustImpute, CMF, cmfrec, cobalt, CoImp, cold, convergEU, CRTgeeDR, dejaVu, denoiseR, DescTools, dlookr, dosearch, DrImpute, DTSg, DTWBI, DTWUMI, ECLRMC, edmcr, eechidna, eicm, eigenmodel, eRm, experiment, fastLink, FHDI, FILEST, filling, forecast, foster, FSMUMI, gapfill, grf, GSE, gsynth, HardyWeinberg, Hmisc, iai, iCellR, icenReg, idem, ie2misc, imp4p, impimp, imputeFin, imputeMulti, imputeR, imputeTestbench, InformativeCensoring, ipw, IPWboxplot, irrNA, Iscores, isni, isotree, JointAI, kmi, lavaan, LNIRT, lodi, lori, LOST, ltm, MatchThem, mde, mdgc, mdmb, memisc, metagear, metasens, metavcov, MGMM, mi, miceadds, miceFast, micemd, miceRanger, mimi, mirt, misaem, missCompare, missForest, missingHE, missMethods, missRanger, missSBM, misty, mitml, mitools, miWQS, mix, mixture, MKinfer, MLCIRTwithin, mlmi, MMDai, momentuHMM, monomvn, NADIA, naivebayes, nipals, NIRStat, NMADiagT, norm, norm2, NPBayesImputeCat, OpenMx, padr, pan, phylin, plsRbeta, plsRglm, ppmSuite, prefmod, PReMiuM, primePCA, prophet, pseval, psfmi, qgtools, QTLRel, Qtools, QUALYPSO, randomForest, RBtest, RCAL, retroharmonize, Rmagic, rMIDAS, RMixtComp, RMixtCompIO, RMixtCompUtilities, RNAseqNet, robCompositions, robustrank, robustrao, ROptSpace, Rphylopars, rrcovNA, rsem, rsparse, rtop, samon, sanon, SAVER, SCAT, scorecardModelUtils, semTools, sievePH, simFrame, simglm, simputation, simsem, sjlabelled, sjmisc, smcfcs, spacetime, StAMPP, StatMatch, StempCens, stlplus, StratifiedRF, swgee, SynthTools, TAM, targeted, tensorBF, TestDataImputation, tidyr, timeSeries, TreeSim, tsibble, tsrobprep, VarSelLCM, wrangle, wrProteo, xts, zCompositions, zoo.
Archived: BaBooN, cutoffR, ForImp, lqr.

Related links

Other resources