Visualization of clinical data

Laure Cougnaud, Michela Pasetto

February 23, 2022

This vignette focuses on the visualizations available in the clinDataReview package.

We will use example data sets from the clinUtils package.

If you have doubts about the data format, please first check the vignette on data preprocessing.

If everything is clear on that side, let’s get started!

Please note that the patient profiles and interactive visualizations are only displayed in the vignette if Pandoc is available.

library(clinDataReview)
library(pander)
library(plotly)
library(clinUtils)

data(dataADaMCDISCP01)
labelVars <- attr(dataADaMCDISCP01, "labelVars")

varsLB <- c(
    "PARAM", "PARAMCD", "USUBJID", "TRTP", 
    "ADY", "VISITNUM", "VISIT", "LBSTRESN"
)
dataLB <- dataADaMCDISCP01$ADLBC[, varsLB]

varsAE <- c("USUBJID", "AESOC", "AEDECOD", "ASTDY", "AENDY", "AESEV")
dataAE <- dataADaMCDISCP01$ADAE[, varsAE]

varsDM <- c("RFSTDTC", "USUBJID")
dataDM <- dataADaMCDISCP01$ADSL[, varsDM]

1 Patient profiles

The interactive visualizations of the clinDataReview package include functionality to link a plot to a patient-specific report, e.g. patient profiles created with the patientProfilesVis package.

Such patient profiles can be created via a config file, with a dedicated template report available in the clinDataReview package.

A simple patient profile report for each subject in the example dataset is created below.

Please note that the patient profiles are created and included in the interactive visualizations only during an interactive session (as detected via interactive()).

# create a directory to store the patient profiles:
patientProfilesDir <- "patientProfiles"
dir.create(patientProfilesDir)

# get examples of parameters for the report
configDir <- system.file("skeleton", "config", package = "clinDataReview")
params <- getParamsFromConfig(
    configDir = configDir, 
    configFile = "config-patientProfiles.yml"
)
# create patient profile with only one panel for the demo
params$patientProfilesParams <- params$patientProfilesParams[1]
# use dataset from the clinUtils package
params$pathDataFolder <- system.file("extdata", "cdiscpilot01", "SDTM", package = "clinUtils")
# store patient profile in this folder:
params$patientProfilePath <- patientProfilesDir

# create patient profiles
pathTemplate <- clinDataReview::getPathTemplate(params$template)
file.copy(from = pathTemplate, to = ".")
report <- rmarkdown::render(
    input = basename(pathTemplate), 
    envir = new.env()
)
unlink(basename(pathTemplate))
unlink(basename(report))

Please refer to the vignette about reporting for more details on how to set up a config file and use template reports available in the package.

You can skip directly to the reporting vignette by running the command below in your console.

vignette("clinDataReview-reporting", "clinDataReview")

2 Data visualization

All the visualizations available in the package are interactive.

2.1 Visualization of individual profiles

Visualization of individual profiles is available via the function scatterplotClinData.

2.1.1 Explore the visualization data

To facilitate the exploration of the data, the data underlying each visualization can also be included as a table below the plot by setting the parameter table to TRUE.

Please note that this functionality is not demonstrated in this document to ensure a lightweight vignette in the package.

2.1.3 Spaghetti plot of time profile
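As a minimal sketch of such a plot, the example below draws one line per subject with scatterplotClinData. The toy dataset (dataToy and its values) is hypothetical and only stands in for the laboratory data; the argument names follow the function's documented interface as we understand it, so please check them against the help page of your installed version.

```r
# Toy laboratory data (hypothetical values, for illustration only)
dataToy <- data.frame(
    USUBJID = rep(c("01-001", "01-002", "01-003"), each = 3),
    TRTP = rep(c("Placebo", "Placebo", "Active"), each = 3),
    ADY = rep(c(1, 15, 29), times = 3),
    LBSTRESN = c(20, 22, 21, 18, 25, 30, 24, 23, 19),
    stringsAsFactors = FALSE
)

# The plot is only created if the package is installed
if (requireNamespace("clinDataReview", quietly = TRUE)) {
    plSpaghetti <- clinDataReview::scatterplotClinData(
        data = dataToy,
        xVar = "ADY", yVar = "LBSTRESN",
        # points colored by treatment
        aesPointVar = list(color = "TRTP"),
        # one line per subject, colored by treatment
        aesLineVar = list(group = "USUBJID", color = "TRTP")
    )
    plSpaghetti
}
```

Adding table = TRUE to the same call should attach the underlying data as a table below the plot.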

2.1.4 Scatterplot

2.1.5 eDish plot

2.1.6 Visualization of time-intervals

Time-intervals are displayed with the timeProfileIntervalPlot function:

By default, empty intervals are represented if the start/end time variables are missing. Missing start/end time can be imputed, or different symbols can be used to represent such cases:
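The sketch below illustrates one possible way to handle missing start/end days before plotting. The toy data (aeToy), the status column and the imputation rule (first/last observed day) are assumptions made for this example only, not the rule used by the package or by the authors.

```r
# Toy adverse event data (hypothetical values); NA marks a missing day
aeToy <- data.frame(
    USUBJID = c("01-001", "01-001", "01-002", "01-003"),
    AEDECOD = c("NAUSEA", "HEADACHE", "NAUSEA", "DYSPEPSIA"),
    ASTDY = c(3, 10, NA, 7),
    AENDY = c(8, NA, 12, 9),
    AESEV = c("MILD", "MODERATE", "MILD", "SEVERE"),
    stringsAsFactors = FALSE
)

# Flag the records with incomplete timing before imputing
aeToy$timeStatus <- with(aeToy, ifelse(
    is.na(ASTDY), "Missing start",
    ifelse(is.na(AENDY), "Missing end", "Complete")
))

# Simple imputation rule (an assumption of this sketch):
# missing start -> first observed day, missing end -> last observed day
aeToy$ASTDYimp <- with(aeToy,
    ifelse(is.na(ASTDY), min(ASTDY, na.rm = TRUE), ASTDY))
aeToy$AENDYimp <- with(aeToy,
    ifelse(is.na(AENDY), max(AENDY, na.rm = TRUE), AENDY))

# One row of intervals per subject, colored by severity
if (requireNamespace("clinDataReview", quietly = TRUE)) {
    clinDataReview::timeProfileIntervalPlot(
        data = aeToy,
        paramVar = "USUBJID",
        timeStartVar = "ASTDYimp",
        timeEndVar = "AENDYimp",
        colorVar = "AESEV"
    )
}
```

Keeping the original and the imputed variables side by side, together with the status flag, makes it easy to trace which intervals were imputed.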

2.2 Visualization of summary statistics

Summary statistics can also be visualized with the package, via different types of visualizations: sunburst, treemap and barplot.

These functions take as input a table of summary statistics, typically counts. Such a table can be computed, e.g., with the inTextSummaryTable R package (see the corresponding package vignette for more information).

2.2.2 Categorical variables

2.2.2.1 Compute count statistics

In this example, counts of adverse events are extracted for each Primary System Organ Class and Dictionary-Derived Term.

Besides the counts of the number of subjects, the paths to the patient profile report for each subgroup are extracted and combined.

library(inTextSummaryTable)

# total counts: Safety Analysis Set (patients with start date for the first treatment)
dataTotal <- subset(dataDM, RFSTDTC != "")

## patient profiles report

if(interactive()){

    # add path in data
    
    dataAE$patientProfilePath <- paste0(
        "patientProfiles/subjectProfile-", 
        sub("/", "-", dataAE$USUBJID), ".pdf"
    )

    # add link in data (for attached table)
    dataAE$patientProfileLink <- with(dataAE,
        paste0(
            '<a href="', patientProfilePath, 
            '" target="_blank">', USUBJID, '</a>'
        )
    )

    # Specify extra summarizations besides the standard stats.
    # When the data are summarized,
    # the patient profile paths are summarized
    # as well across patients
    # (the paths should be collapsed with: ', ')
    statsExtraPP <- list(
        statPatientProfilePath = function(data) 
          toString(sort(unique(data$patientProfilePath))),
        statPatientProfileLink = function(data)
          toString(sort(unique(data$patientProfileLink)))
    )
    
}

# get counts (records, subjects, % subjects) + stats with subjects profiles path
statsPP <- c(
    getStats(type = "count"),
    if(interactive())
        list(
            patientProfilePath = quote(statPatientProfilePath),
            patientProfileLink = quote(statPatientProfileLink)
        )
)

dataAE$AESEV <- factor(
    dataAE$AESEV,
    levels = c("MILD", "MODERATE", "SEVERE")
)
dataAE$AESEVN <- as.numeric(dataAE$AESEV)

# compute adverse event table
tableAE <- computeSummaryStatisticsTable(
    
    data = dataAE,
    rowVar = c("AESOC", "AEDECOD"),
    dataTotal = dataTotal,
    labelVars = labelVars,
    
    # The total across the variable used for the nodes
    # should be specified
    rowVarTotalInclude = c("AESOC", "AEDECOD"),
    
    rowOrder = "total",
    
    # statistics of interest
    # include columns with patients
    stats = statsPP, 
    # add extra 'statistic': concatenate subject IDs
    statsExtra = if(interactive())  statsExtraPP

)
pander(head(tableAE),
    caption = paste("Extract of the Adverse Event summary table",
        "used for the sunburst and barplot visualization"
    )
)
Extract of the Adverse Event summary table used for the sunburst and barplot visualization (continued below)

AESOC                                                 AEDECOD                      isTotal  statN
----------------------------------------------------  ---------------------------  -------  -----
CARDIAC DISORDERS                                     MYOCARDIAL INFARCTION        FALSE    1
GASTROINTESTINAL DISORDERS                            DYSPEPSIA                    FALSE    1
GASTROINTESTINAL DISORDERS                            NAUSEA                       FALSE    2
GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS  APPLICATION SITE DERMATITIS  FALSE    1
GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS  APPLICATION SITE ERYTHEMA    FALSE    3
GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS  APPLICATION SITE IRRITATION  FALSE    2

statm  statPercTotalN  statPercN  n  %     m
-----  --------------  ---------  -  ----  -
1      7               14.29      1  14.3  1
1      7               14.29      1  14.3  1
7      7               28.57      2  28.6  7
2      7               14.29      1  14.3  2
3      7               42.86      3  42.9  3
4      7               28.57      2  28.6  4

2.2.2.2 Sunburst

The sunburstClinData function visualizes the counts of hierarchical data in nested circles.

The different groups are visualized from the biggest class (root node) in the center of the visualization to the smallest sub-groups (leaves) on the outside of the circles.

The size of the different segments is relative to the respective counts.
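A minimal sketch of a sunburst is given below. It assumes that the tableAE object computed in the section on count statistics is available in the session, and it uses the numeric statN column (number of subjects) as the segment size; the block is skipped otherwise. The argument names are those we understand the function to expose and should be checked against its help page.

```r
# This sketch relies on 'tableAE' computed earlier in this vignette,
# so it is skipped when that object or the package is unavailable
canPlot <- exists("tableAE") &&
    requireNamespace("clinDataReview", quietly = TRUE)

if (canPlot) {
    clinDataReview::sunburstClinData(
        data = tableAE,
        # hierarchy, from parent node to child node
        vars = c("AESOC", "AEDECOD"),
        # numeric count used for the size of each segment
        valueVar = "statN",
        valueLab = "Number of patients with adverse events"
    )
}
```

To our understanding, treemapClinData accepts the same vars and valueVar arguments, so the same input table can be reused for a treemap.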

2.2.2.3 Treemap

A treemap visualizes the counts of the hierarchical data in nested rectangles. The area of each rectangle is proportional to the counts of the respective group.

Note that a treemap can also be colored according to a meaningful variable. For instance, if we show adverse events, we might color the plot by severity. This can be achieved with the colorVar parameter.

2.2.2.4 Barplot

A barplot visualizes the counts for one single variable in a specific order.
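The sketch below shows one way to control the order of the bars, by encoding it in the factor levels of the x-variable before plotting. The toy counts (countsToy) are hypothetical, and the assumption that the bars follow the factor levels should be verified on your installed version of the package.

```r
# Toy count table (hypothetical values, for illustration only)
countsToy <- data.frame(
    AEDECOD = c("NAUSEA", "DYSPEPSIA", "APPLICATION SITE ERYTHEMA"),
    n = c(2, 1, 3),
    stringsAsFactors = FALSE
)

# Encode the display order (decreasing counts) in the factor levels
countsToy$AEDECOD <- factor(
    countsToy$AEDECOD,
    levels = countsToy$AEDECOD[order(countsToy$n, decreasing = TRUE)]
)

if (requireNamespace("clinDataReview", quietly = TRUE)) {
    clinDataReview::barplotClinData(
        data = countsToy,
        xVar = "AEDECOD",
        yVar = "n", yLab = "Number of patients with adverse events"
    )
}
```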

2.2.3 Continuous variable

2.2.3.2 Plot error bars/confidence intervals

2.2.3.3 Boxplot

A boxplot visualizes the distribution of a continuous variable of interest versus specific categorical variables.

This visualization doesn’t rely on pre-computed statistics, so the continuous variable of interest is directly passed to the functionality.
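As a minimal sketch, the example below passes record-level values directly to boxplotClinData. The toy dataset (labToy) and its values are hypothetical; the variable names mimic those of the laboratory data used in this vignette.

```r
# Toy laboratory data: 6 subjects measured at 2 visits (hypothetical values)
labToy <- data.frame(
    USUBJID = rep(sprintf("01-%03d", 1:6), times = 2),
    TRTP = rep(rep(c("Placebo", "Active"), each = 3), times = 2),
    VISIT = rep(c("BASELINE", "WEEK 2"), each = 6),
    LBSTRESN = c(138, 140, 141, 139, 142, 137, 139, 141, 140, 143, 145, 138),
    stringsAsFactors = FALSE
)

# The raw values are passed as is: no statistics are pre-computed
if (requireNamespace("clinDataReview", quietly = TRUE)) {
    clinDataReview::boxplotClinData(
        data = labToy,
        xVar = "VISIT", yVar = "LBSTRESN",
        colorVar = "TRTP"
    )
}
```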

2.3 Multiple visualizations in a loop

To include multiple clinical data visualizations (with or without attached table) in a loop (in the same Rmarkdown chunk), the list of visualizations should be passed to the knitPrintListObjects function of the clinUtils package.
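The approach can be sketched as follows: the data are split by parameter, one visualization is created per subset, and the resulting list is printed in one go. The toy data (labToy) and the label "lab-boxplot" are hypothetical. The printing step is only run while knitting, and the chunk is expected to use the results = "asis" option; the arguments passed to knitPrintListObjects reflect our understanding of its interface and should be checked against its help page.

```r
# Toy data for two laboratory parameters (hypothetical values)
labToy <- data.frame(
    USUBJID = rep(sprintf("01-%03d", 1:4), times = 2),
    PARAMCD = rep(c("K", "SODIUM"), each = 4),
    TRTP = rep(c("Placebo", "Active"), times = 4),
    LBSTRESN = c(4.1, 4.4, 3.9, 4.6, 139, 141, 140, 143),
    stringsAsFactors = FALSE
)

# 1) split the data by laboratory parameter
dataByParam <- split(labToy, labToy$PARAMCD)

# only create and print the plots while knitting a report
canKnit <- isTRUE(getOption("knitr.in.progress")) &&
    requireNamespace("clinDataReview", quietly = TRUE) &&
    requireNamespace("clinUtils", quietly = TRUE)

if (canKnit) {
    # 2) create one visualization per parameter
    plotsLab <- lapply(dataByParam, function(dataParam)
        clinDataReview::boxplotClinData(
            data = dataParam,
            xVar = "TRTP", yVar = "LBSTRESN"
        )
    )
    # 3) include the list of plots in the report, one title per plot
    clinUtils::knitPrintListObjects(
        xList = plotsLab,
        generalLabel = "lab-boxplot",
        titles = names(plotsLab), titleLevel = 4
    )
}
```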

2.3.0.1 Potassium (mmol/L)

2.3.0.2 Sodium (mmol/L)

3 Palettes

3.1 Set palette for the entire session

Palettes for the colors and shapes associated with specific variables can be set for all clinical data visualizations at once via the clinDataReview.colors and clinDataReview.shapes options, e.g. at the start of the R session.

Please see the clinUtils package for the default colors and shapes.

## function (n, alpha = 1, begin = 0, end = 1, direction = 1, option = "D")
##  int [1:24] 21 22 23 24 25 0 1 2 3 4 ...

The palettes can be set for all visualizations, e.g. at the start of the R session, with:
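A minimal sketch of this mechanism with base R options is shown below. The colors and shapes are arbitrary choices made for this example, and saving and restoring the previous values with options() is a general R idiom, not necessarily the approach used by the authors.

```r
# Set custom palettes and keep the previous values
# (the colors and shapes below are arbitrary choices for this sketch)
oldOpts <- options(
    clinDataReview.colors = c("gold", "pink", "cyan"),
    clinDataReview.shapes = c(15, 17, 19)
)

# The visualizations created from now on would use these palettes
colorsSet <- getOption("clinDataReview.colors")
shapesSet <- getOption("clinDataReview.shapes")

# ... create the visualizations here ...

# Restore whatever was set before
options(oldOpts)
```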

If the palette contains fewer elements than there are groups in the data, its elements are replicated.
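The effect of such a replication can be illustrated with base R. This is only an analogy using rep_len, not the internal code of the package, and the palette and group names are hypothetical.

```r
# A palette with 3 colors, for data containing 5 groups
colorsShort <- c("gold", "pink", "cyan")
groups <- c("A", "B", "C", "D", "E")

# Recycle the palette until each group has a color
colorsGroups <- setNames(rep_len(colorsShort, length(groups)), groups)
colorsGroups
```

With a recycled palette, groups A and D (and B and E) share the same color, so a palette at least as long as the number of groups is preferable when the groups must be distinguishable.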

Palettes are reset to the default patient profiles palettes at the start of a new R session, or by setting:

4 Appendix

4.1 Session info

R version 4.1.2 (2021-11-01)

Platform: x86_64-pc-linux-gnu (64-bit)

locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=C, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C

attached base packages: stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: plyr(v.1.8.6), inTextSummaryTable(v.3.1.1), reshape2(v.1.4.4), plotly(v.4.10.0), ggplot2(v.3.3.5), clinUtils(v.0.1.1), clinDataReview(v.1.2.2), pander(v.0.6.4) and knitr(v.1.37)

loaded via a namespace (and not attached): ggrepel(v.0.9.1), Rcpp(v.1.0.8), tidyr(v.1.2.0), digest(v.0.6.29), utf8(v.1.2.2), R6(v.2.5.1), evaluate(v.0.15), httr(v.1.4.2), pillar(v.1.7.0), gdtools(v.0.2.4), rlang(v.1.0.1), lazyeval(v.0.2.2), uuid(v.1.0-3), data.table(v.1.14.2), DT(v.0.20), flextable(v.0.6.10), rmarkdown(v.2.11), labeling(v.0.4.2), stringr(v.1.4.0), htmlwidgets(v.1.5.4), munsell(v.0.5.0), compiler(v.4.1.2), xfun(v.0.29), pkgconfig(v.2.0.3), systemfonts(v.1.0.4), base64enc(v.0.1-3), htmltools(v.0.5.2), tidyselect(v.1.1.1), tibble(v.3.1.6), bookdown(v.0.24), jsonvalidate(v.1.3.2), fansi(v.1.0.2), viridisLite(v.0.4.0), crayon(v.1.5.0), dplyr(v.1.0.8), withr(v.2.4.3), grid(v.4.1.2), jsonlite(v.1.7.3), gtable(v.0.3.0), lifecycle(v.1.0.1), magrittr(v.2.0.2), scales(v.1.1.1), zip(v.2.2.0), cli(v.3.2.0), stringi(v.1.7.6), farver(v.2.1.0), xml2(v.1.3.3), ellipsis(v.0.3.2), generics(v.0.1.2), vctrs(v.0.3.8), cowplot(v.1.1.1), tools(v.4.1.2), forcats(v.0.5.1), glue(v.1.6.1), officer(v.0.4.1), purrr(v.0.3.4), hms(v.1.1.1), crosstalk(v.1.2.0), fastmap(v.1.1.0), yaml(v.2.3.5), colorspace(v.2.0-3) and haven(v.2.4.3)