GFM: installation and simulated example

Wei Liu

2022-01-04

Install the GFM

This vignette introduces the R package GFM, whose function gfm implements the generalized factor model (GFM) for ultra-high-dimensional variables of mixed types. The package can be installed from GitHub with the command:

library(remotes)

remotes::install_github("feiyoung/GFM")

or from CRAN with:

install.packages("GFM")

The package can be loaded with the command:

library("GFM")
#> Loading required package: doSNOW
#> Warning: package 'doSNOW' was built under R version 4.0.5
#> Loading required package: foreach
#> Loading required package: iterators
#> Loading required package: snow
#> Warning: package 'snow' was built under R version 4.0.5
#> Loading required package: parallel
#> 
#> Attaching package: 'parallel'
#> The following objects are masked from 'package:snow':
#> 
#>     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#>     clusterExport, clusterMap, clusterSplit, makeCluster, parApply,
#>     parCapply, parLapply, parRapply, parSapply, splitIndices,
#>     stopCluster

Fit the GFM model using simulated data

Generate data with homogeneous normal variables

First, we generate the data with homogeneous normal variables.
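The original code chunk for this step is not shown here. As a minimal base R sketch of what "homogeneous normal variables" means in a factor model, the example below draws latent factors and loadings and adds Gaussian noise with one common variance. The sizes and the names n, p, q, H, B, mu and X are assumptions chosen for illustration and are not taken from the package.

```r
# Illustrative sketch in base R; all names and sizes are assumptions.
set.seed(1)
n <- 300  # number of observations
p <- 50   # number of variables
q <- 6    # number of latent factors

H  <- matrix(rnorm(n * q), n, q)  # latent factors, n x q
B  <- matrix(rnorm(p * q), p, q)  # loadings, p x q
mu <- rnorm(p)                    # variable-specific intercepts

# Homogeneous errors: every variable shares the same error variance (1 here).
E <- matrix(rnorm(n * p, sd = 1), n, p)
X <- H %*% t(B) + matrix(mu, n, p, byrow = TRUE) + E
```

The resulting X is an n x p matrix whose columns all have the same noise level, which is the setting where a linear factor model is already well specified.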

Second, we set the algorithm parameters and fit the model.

Third, we fit the GFM model with a user-specified number of factors.

The number of factors can also be determined in a data-driven manner.
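To illustrate the idea of a data-driven choice, the sketch below uses an eigenvalue-ratio rule on simulated Gaussian data. This is a generic technique swapped in for illustration and is not claimed to be the criterion that gfm uses; the names q_true, q_max, ratio and q_hat are hypothetical.

```r
# Eigenvalue-ratio sketch in base R; not the package's own selection rule.
set.seed(1)
n <- 300; p <- 50; q_true <- 3
H <- matrix(rnorm(n * q_true), n, q_true)     # true factors
B <- matrix(rnorm(p * q_true), p, q_true)     # true loadings
X <- H %*% t(B) + matrix(rnorm(n * p), n, p)  # Gaussian data

# Eigenvalues of the sample covariance, via the singular values of centered X.
Xc <- scale(X, center = TRUE, scale = FALSE)
ev <- svd(Xc, nu = 0, nv = 0)$d^2 / n

# Pick the index where consecutive eigenvalues drop the most in relative terms.
q_max <- 10
ratio <- ev[1:q_max] / ev[2:(q_max + 1)]
q_hat <- which.max(ratio)
```

The rule looks for the largest gap between "signal" and "noise" eigenvalues, so it needs an upper bound q_max on the number of factors considered.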

Generate data with heterogeneous normal variables

First, we generate the data with heterogeneous normal variables and set the algorithm parameters.
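As a hedged base R sketch of the heterogeneous case, the only change from the homogeneous setting is that each variable gets its own error standard deviation. The range used for sigma and all names below are assumptions for illustration.

```r
# Illustrative sketch in base R; all names and sizes are assumptions.
set.seed(1)
n <- 300; p <- 50; q <- 6
H <- matrix(rnorm(n * q), n, q)  # latent factors
B <- matrix(rnorm(p * q), p, q)  # loadings

# Heterogeneous errors: one standard deviation per variable (column).
sigma <- runif(p, min = 0.5, max = 3)
E <- sweep(matrix(rnorm(n * p), n, p), 2, sigma, "*")
X <- H %*% t(B) + E
```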

Second, we fit the GFM model with a user-specified number of factors and compare the results with those of linear factor models.
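One common way to make such a comparison is to estimate the linear factor model by principal components and then measure how well the estimated factors recover the true ones through canonical correlations. The sketch below does this in base R on simple Gaussian data. The names H_lfm, B_lfm and cc are hypothetical, and the choice of the smallest canonical correlation as the summary is an assumption rather than the package's documented metric.

```r
# Principal-component estimate of a linear factor model, base R only.
set.seed(1)
n <- 300; p <- 50; q <- 3
H <- matrix(rnorm(n * q), n, q)               # true factors
B <- matrix(rnorm(p * q), p, q)               # true loadings
X <- H %*% t(B) + matrix(rnorm(n * p), n, p)

Xc <- scale(X, center = TRUE, scale = FALSE)
sv <- svd(Xc, nu = q, nv = q)
H_lfm <- sqrt(n) * sv$u                             # estimated factors, n x q
B_lfm <- sv$v %*% diag(sv$d[1:q], q) / sqrt(n)      # estimated loadings, p x q

# Factors are identified only up to rotation, so compare the spanned spaces:
# canonical correlations close to 1 indicate good recovery.
cc <- cancor(H_lfm, H)$cor
min(cc)
```

Because H_lfm %*% t(B_lfm) is the best rank-q approximation of the centered data, the same pair of estimates can also be used to compare fitted values rather than factor spaces.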

The number of factors can also be determined in a data-driven manner.

Generate data with count (Poisson) variables

First, we generate the data with count (Poisson) variables and set the algorithm parameters.
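A minimal base R sketch of count data with a latent factor structure is given below, assuming a log link between the factors and the Poisson rates. The small loading scale is an assumption made only to keep the rates moderate, and all names are illustrative.

```r
# Illustrative sketch in base R; the log link and all names are assumptions.
set.seed(1)
n <- 300; p <- 50; q <- 3
H <- matrix(rnorm(n * q), n, q)            # latent factors
B <- matrix(rnorm(p * q, sd = 0.3), p, q)  # small loadings keep rates moderate

eta <- H %*% t(B)                          # linear predictor, n x p
# rpois reads lambda column by column, matching how matrix() fills X.
X <- matrix(rpois(n * p, lambda = exp(eta)), n, p)
```

Here the factors act on the log of the mean rather than on the mean itself, which is why a linear factor model applied directly to X is misspecified.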

Second, we fit the GFM models in parallel.

Third, we compare the results with those of linear factor models.

Generate data with mixed types of Poisson and binomial variables

First, we generate the data with mixed Poisson and binomial variables and set the algorithm parameters. Second, we fit the GFM model with a user-specified number of factors.
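The sketch below shows, in base R, one way to simulate mixed-type data that shares a single set of latent factors: the first block of columns is Poisson with a log link and the second block is binary with a logit link. The block sizes, the links, and the bookkeeping vectors group and type are assumptions for illustration and are not presented as the arguments gfm expects.

```r
# Illustrative sketch in base R; links, sizes, and names are assumptions.
set.seed(1)
n <- 300; p <- 50; q <- 3
p_pois <- 25                               # first 25 columns are counts
H <- matrix(rnorm(n * q), n, q)            # shared latent factors
B <- matrix(rnorm(p * q, sd = 0.3), p, q)  # loadings
eta <- H %*% t(B)                          # linear predictor, n x p

idx_pois <- 1:p_pois
idx_bino <- (p_pois + 1):p

X <- matrix(0, n, p)
X[, idx_pois] <- rpois(n * p_pois, lambda = exp(eta[, idx_pois]))
X[, idx_bino] <- rbinom(n * (p - p_pois), size = 1,
                        prob = plogis(eta[, idx_bino]))

# Hypothetical bookkeeping: which type each column belongs to.
group <- rep(1:2, times = c(p_pois, p - p_pois))
type  <- c("poisson", "binomial")
```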

Third, we compare the results with those of linear factor models.

Compare with linear factor models

Session information

sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 22000)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=C                              
#> [2] LC_CTYPE=Chinese (Simplified)_China.936   
#> [3] LC_MONETARY=Chinese (Simplified)_China.936
#> [4] LC_NUMERIC=C                              
#> [5] LC_TIME=Chinese (Simplified)_China.936    
#> 
#> attached base packages:
#> [1] parallel  stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#> [1] GFM_1.1.0        doSNOW_1.0.19    snow_0.4-4       iterators_1.0.13
#> [5] foreach_1.5.1   
#> 
#> loaded via a namespace (and not attached):
#>  [1] codetools_0.2-18 digest_0.6.28    MASS_7.3-53.1    R6_2.5.1        
#>  [5] jsonlite_1.7.2   magrittr_2.0.1   evaluate_0.14    rlang_0.4.11    
#>  [9] stringi_1.7.5    jquerylib_0.1.4  bslib_0.3.1      rmarkdown_2.7   
#> [13] tools_4.0.3      stringr_1.4.0    xfun_0.28        yaml_2.2.1      
#> [17] fastmap_1.1.0    compiler_4.0.3   htmltools_0.5.2  knitr_1.36      
#> [21] sass_0.4.0