This vignette provides additional details of the analysis model that is fitted as well as a demonstration of how to use the package to fit said models to an existing dataset. For examples of how to fit the analysis model to the simulated datasets generated by the package itself, please see the user-guide vignette.
The “Analysis Model” section of this vignette is a modified excerpt from Lu, Y. et al. (2021).
The package uses Markov chain Monte Carlo (MCMC) through JAGS to obtain the posterior distribution for the hazard ratio between treatment and control arms. The package relies on the Bayesian commensurate prior approach (Lewis, et al., 2019). The probability density function for the survival time of patient \(i\) following a Weibull model is as below:
\[ t_i \sim Weibull( \nu, \lambda_i) \]
for \(i = 1, 2, \dots,n\), where:
The probability density function is:
\[ f(t_i, x_i) = \nu\lambda_i t^{\nu-1} e^{-\lambda_it^\nu} \]
with the rate taking the format:
\[ \lambda_i = exp(\alpha_i + \beta_{trt} \times I_{trt} + \beta^T \times X_{i}) \]
Where:
A hierarchical model is defined in which the log hazard rate for the concurrent controls is centered on the external controls, i.e. \(\alpha_{cc} | \alpha_{ec} \sim N(\alpha_{ec}, \tau)\). The hyperprior \(\tau\), also referred to as the commensurability parameter, determines the extent of borrowing (see Lewis, et al., 2019).
Four borrowing methods have been implemented:
For methods (1) and (2), a conventional Bayesian model without dynamic borrowing is deployed with all relevant parameters assigned a non-informative prior. For method (1) no external control arm information is used and as such \(\alpha_{cc} = \alpha \sim N(0, 0.0001)\) For method (2) the external control arm is treated as part of the concurrent control and as such \(\alpha_{cc} = \alpha_{ec} = \alpha \sim N(0, 0.0001)\).
In comparison, methods (3) and (4) have the commensurability hyper-parameter \(\tau\), which acts as the precision variable assessing the similarity between the concurrent and external control groups. The prior of \(tau\) follows a Gamma distribution (default: shape = 1, rate = 0.001) and a half-Cauchy distribution (default: location = 0, scale = 0.2), respectively, in methods 3 and 4. Users can update the default values for the parameter if they wish to customize the hyper-prior.
After users state the prior choices and the simulated data is ready for use, the Markov chain Monte Carlo (MCMC) algorithm is deployed to obtain samples from the posterior distribution.
To demonstrate how to use the package to fit the analysis model we first generate a dataset where the time to event is drawn from the following distribution:
\[ t_i \sim Weibull(0.95, \lambda_i) \] Where:
\[ \lambda_i = \frac{1}{150}I_{ec} \times \frac{1}{200}I_{cc} \times \frac{1}{400}I_{trt} \times exp(0.2\times x_{age} - 0.1\times I_{female}) \]
Patients are then censored by drawing a censoring time from the exponential distribution with a rate parameter of \(\lambda = 1/400\)
suppressPackageStartupMessages({
library(psborrow)
library(dplyr)
library(flexsurv)
})
set.seed(401)
<- 1000
n
<- c("CC", "TRT", "EC")
grp_lvl <- c("Male", "Female")
sex_lvl
<- c(
log_hr_grp "TRT" = log(1 / 400),
"CC" = log(1 / 200),
"EC" = log(1 / 150)
)
<- c(
log_hr_sex "Male" = 0,
"Female" = -0.1
)
<- 0.2
log_hr_age
<- 0.95
Shape
<- tibble(
dat pt = 1:n,
grp = factor(
sample(grp_lvl, size = n, replace = TRUE, prob = c(0.3, 0.3, 0.4)),
levels = grp_lvl
),sex = factor(
sample(sex_lvl, size = n, replace = TRUE, prob = c(0.4, 0.6)),
levels = sex_lvl
),age = rnorm(n, mean = 0, sd = 1),
lambda = exp(
* age +
log_hr_age as.character(sex)] +
log_hr_sex[as.character(grp)]
log_hr_grp[
),time = rweibullPH(n, scale = lambda, shape = Shape),
centime = rexp(n, 1 / 400)
%>%
) mutate(event = ifelse(time <= centime, 1, 0)) %>%
mutate(time = ifelse(time <= centime, time, centime)) %>%
select(pt, grp, age, sex, time, event)
dat#> # A tibble: 1,000 × 6
#> pt grp age sex time event
#> <int> <fct> <dbl> <fct> <dbl> <dbl>
#> 1 1 TRT 1.12 Female 64.0 1
#> 2 2 CC -0.577 Female 14.3 0
#> 3 3 CC -0.281 Female 190. 0
#> 4 4 CC -0.371 Male 348. 0
#> 5 5 CC 1.29 Female 43.6 1
#> 6 6 EC -0.840 Male 27.8 0
#> 7 7 EC 0.901 Female 706. 1
#> 8 8 TRT 0.906 Female 105. 1
#> 9 9 CC 1.43 Male 57.9 1
#> 10 10 TRT 2.75 Female 85.9 1
#> # … with 990 more rows
To conduct an analysis we need to use the apply_mcmc()
function. However to use this function we first need to convert our data
into the required format. Full details of the required format can be
found in help(apply_mcmc)
however an example of this is
shown below:
<- dat %>%
dat2 mutate(trt = ifelse(grp == "TRT", 1, 0)) %>%
mutate(ext = ifelse(grp == "EC", 1, 0)) %>%
mutate(cnsr = 1 - event) %>%
select(pt, trt, ext, time, cnsr, sex, age)
dat2#> # A tibble: 1,000 × 7
#> pt trt ext time cnsr sex age
#> <int> <dbl> <dbl> <dbl> <dbl> <fct> <dbl>
#> 1 1 1 0 64.0 0 Female 1.12
#> 2 2 0 0 14.3 1 Female -0.577
#> 3 3 0 0 190. 1 Female -0.281
#> 4 4 0 0 348. 1 Male -0.371
#> 5 5 0 0 43.6 0 Female 1.29
#> 6 6 0 1 27.8 1 Male -0.840
#> 7 7 0 1 706. 0 Female 0.901
#> 8 8 1 0 105. 0 Female 0.906
#> 9 9 0 0 57.9 0 Male 1.43
#> 10 10 1 0 85.9 0 Female 2.75
#> # … with 990 more rows
We can now pass our data to apply_mcmc()
to produce our
posterior samples. Prior distributions are specified using the
set_prior()
function, details for which can be found in
help(set_prior)
as well as the user-guide vignette.
Covariates for the analysis model need to be specified by a 1-sided
formula provided to the formula_cov
argument. Note that the
treatment indicator is automatically added to the model and should not
be specified in the covariate formula; if you require treatment
interactions these should be specified explicitly via
trt:cov
rather than by trt*cov
. The intercept
term must always be included in this formula. You are advised to leave
the pred
parameter as either “all” or “ps” depending on
your requirements. If you do wish to restrict the covariates included in
the model then this should be done by omitting them from the
formula_cov
rather than the pred
argument in
set_prior()
.
<- apply_mcmc(
postsamp dt = dat2,
formula_cov = ~ 1 + age + sex,
priorObj = set_prior(pred = "all", prior = "gamma", r0 = 0.85, alpha = c(0, 0)),
n.chains = 1,
n.adapt = 100,
n.burn = 100,
n.iter = 200,
seed = 47
)#> Input all is found in pred. Any other input is disregarded.
#>
|
| | 0%
|
|* | 2%
|
|** | 4%
|
|*** | 6%
|
|**** | 8%
|
|***** | 10%
|
|****** | 12%
|
|******* | 14%
|
|******** | 16%
|
|********* | 18%
|
|********** | 20%
|
|*********** | 22%
|
|************ | 24%
|
|************* | 26%
|
|************** | 28%
|
|*************** | 30%
|
|**************** | 32%
|
|***************** | 34%
|
|****************** | 36%
|
|******************* | 38%
|
|******************** | 40%
|
|********************* | 42%
|
|********************** | 44%
|
|*********************** | 46%
|
|************************ | 48%
|
|************************* | 50%
|
|************************** | 52%
|
|*************************** | 54%
|
|**************************** | 56%
|
|***************************** | 58%
|
|****************************** | 60%
|
|******************************* | 62%
|
|******************************** | 64%
|
|********************************* | 66%
|
|********************************** | 68%
|
|*********************************** | 70%
|
|************************************ | 72%
|
|************************************* | 74%
|
|************************************** | 76%
|
|*************************************** | 78%
|
|**************************************** | 80%
|
|***************************************** | 82%
|
|****************************************** | 84%
|
|******************************************* | 86%
|
|******************************************** | 88%
|
|********************************************* | 90%
|
|********************************************** | 92%
|
|*********************************************** | 94%
|
|************************************************ | 96%
|
|************************************************* | 98%
|
|**************************************************| 100%
From here we can extract our posterior samples as follows:
head(extract_samples(postsamp))
#> HR_cc_hc HR_trt_cc alpha[1] alpha[2] beta_trt beta_age
#> [1,] 0.9011503 0.3671814 -4.949371 -4.845288 -1.0018994 0.2745032
#> [2,] 0.9738321 0.4697018 -4.876703 -4.850186 -0.7556573 0.2539210
#> [3,] 0.9455914 0.3991505 -4.942033 -4.886088 -0.9184167 0.3249545
#> [4,] 0.9798274 0.4088358 -4.897589 -4.877210 -0.8944417 0.2851334
#> [5,] 1.0144295 0.3605800 -4.859848 -4.874174 -1.0200414 0.2935454
#> [6,] 1.0137068 0.3633342 -4.903234 -4.916847 -1.0124321 0.2535249
#> beta_sexFemale r0 tau
#> [1,] -0.25909116 0.9201647 360.0789
#> [2,] -0.25321051 0.9204208 283.2014
#> [3,] -0.22758668 0.9211114 667.9504
#> [4,] -0.16777557 0.9215409 2315.9094
#> [5,] -0.18770880 0.9191615 2675.4973
#> [6,] -0.09992967 0.9020110 783.1036
For reference the returned parameters are:
HR_cc_hc
- The hazard ratio between the concurrent
control arm and the historical control arm. This can be be thought of as
the ratio of the scale parameter between the baseline trial distribution
and the baseline external control distribution. This is equivalent to
exp(alpha[1] - alpha[2])
HR_trt_cc
- The hazard ratio between the treatment
arm and the concurrent control arm. This is equivalent to
exp(beta_trt)
alpha[1]
- The shape parameter for the trial’s
baseline distribution
alpha[2]
- The shape parameter for the historical
control’s baseline distribution
beta_trt
- The log-hazard ratio for the treatment
effect. This is equivalent to log(HR_trt_cc)
beta_age
- The log-hazard ratio for the covariate
age
beta_sexFemale
- The log-hazard ratio for the
covariate sexFemale
r0
- The scale parameter for the baseline
distribution of both the trial and the historical control
tau/sigma
- The precision/variance for
alpha[1]
i.e. controls how much information is borrowed
from the historical control arm
If prior = "no_ext"
or prior = "full_ext"
then alpha[1]
and alpha[2]
will be replaced
with a single parameter alpha
representing the shape
parameter for the single baseline distribution for the trail.
Additionally general summary statistics of our samples can be
generated by the summary()
function:
summary(postsamp)
#> parameter mean sd lci uci
#> 1 HR_cc_hc 0.9901932 0.03668913 0.9120554 1.06368087
#> 2 HR_trt_cc 0.3857669 0.03881582 0.3230326 0.46980137
#> 3 alpha[1] -4.9414910 0.09106815 -5.0775808 -4.76802672
#> 4 alpha[2] -4.9309512 0.08729994 -5.0899134 -4.75289074
#> 5 beta_trt -0.9573926 0.09823656 -1.1300021 -0.75544616
#> 6 beta_age 0.2781143 0.04180654 0.2019639 0.35759578
#> 7 beta_sexFemale -0.1390660 0.07769175 -0.2664242 0.02923811
#> 8 r0 0.9239697 0.01768511 0.8921297 0.95277956
#> 9 tau 1002.0669283 999.10913348 49.5932394 3564.40952449
Lu, Y., Lin, A., Pang, H., Zhu, J. (2021). PSBORROW: BAYESIAN DYNAMIC BORROWING R TOOL FOR COMPLEX INNOVATIVE TRIAL DESIGNS. ASA Biopharmaceutical Report, 28 (3), 11-19.
Lewis, C. J., Sarkar, S., Zhu, J., & Carlin, B. P. (2019). Borrowing from historical control data in cancer drug development: a cautionary tale and practical guidelines. Statistics in biopharmaceutical research, 11(1), 67-78.