Propensity score estimation

The approaches implemented in psrwe are mostly based on propensity score adjustment. Propensity scores are estimated with the function psrwe_est, which also assigns subjects to propensity score strata.

data(ex_dta)
dta_ps <- psrwe_est(ex_dta,
                     v_covs = paste("V", 1:7, sep = ""),
                     v_grp = "Group",
                     cur_grp_level = "current",
                     nstrata = 5,
                     ps_method = "logistic")
dta_ps

## This is a single-arm study. A total of 1031 RWD subjects and 
## 200 current study subjects are used to estimate propensity 
## scores by logistic model. A total of 5 RWD subjects are 
## trimmed and excluded from the final analysis. The following 
## covariates are adjusted in the propensity score model: V1, 
## V2, V3, V4, V5, V6, V7.
## 
## The following table summarizes the number of subjects in 
## each stratum, and the distance in PS distributions 
## calculated by overlapping area:
## 
##     Stratum N_RWD N_Current  Distance
## 1 Stratum 1   729        40 0.5613996
## 2 Stratum 2   156        40 0.7208211
## 3 Stratum 3    78        40 0.8042718
## 4 Stratum 4    50        40 0.8104474
## 5 Stratum 5    13        40 0.7960132
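The estimation and stratification step can be approximated in base R. The following is a minimal sketch on simulated data, not the internals of psrwe_est. It assumes that RWD subjects whose propensity scores fall outside the range observed in the current study are trimmed, and that strata are cut at quantiles of the current-study scores, which is consistent with the equal N_Current of 40 per stratum in the output above. The data frame, the two covariates, and the column names is_cur, ps, and stratum are all hypothetical.

```r
set.seed(123)

## Hypothetical data: 1000 RWD subjects and 200 current-study subjects
n_rwd <- 1000
n_cur <- 200
dta <- data.frame(
  Group = rep(c("rwd", "current"), c(n_rwd, n_cur)),
  V1    = c(rnorm(n_rwd, 0.0), rnorm(n_cur, 0.5)),
  V2    = c(rnorm(n_rwd, 0.0), rnorm(n_cur, 0.3))
)
dta$is_cur <- as.integer(dta$Group == "current")

## Propensity score: probability of belonging to the current study
fit    <- glm(is_cur ~ V1 + V2, data = dta, family = binomial())
dta$ps <- fitted(fit)

## Assumed trimming rule: drop RWD subjects outside the current-study PS range
cur_ps <- dta$ps[dta$is_cur == 1]
keep   <- dta$is_cur == 1 | (dta$ps >= min(cur_ps) & dta$ps <= max(cur_ps))
dta    <- dta[keep, ]

## Assumed stratification rule: cut at quantiles of the current-study PS
cuts <- quantile(cur_ps, probs = seq(0, 1, length.out = 6))
dta$stratum <- cut(dta$ps, breaks = cuts, include.lowest = TRUE,
                   labels = paste("Stratum", 1:5))

## Number of subjects per stratum and group
table(dta$stratum, dta$Group)
```

Cutting at current-study quantiles places the same number of current-study subjects in every stratum, while the number of RWD subjects varies with how the two score distributions overlap.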

It is extremely important to evaluate the propensity score adjustment results. In psrwe, functions are provided to visualize the balance in covariate distributions and propensity score distributions based on propensity score stratification.

plot(dta_ps, plot_type = "balance")

## Warning: The `.dots` argument of `group_by()` is deprecated as of dplyr 1.0.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.

plot(dta_ps, plot_type = "ps")
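The Distance column reported by psrwe_est is described as an overlapping area between propensity score distributions. One way to compute such a quantity is sketched below with kernel density estimates. This is an illustration only: the estimator used inside psrwe is not shown in this document, and the function name overlap_area and the simulated scores are hypothetical.

```r
## Overlapping area of two kernel density estimates on a common grid
overlap_area <- function(x, y, n_grid = 512) {
  rng  <- range(c(x, y))
  dx   <- density(x, from = rng[1], to = rng[2], n = n_grid)
  dy   <- density(y, from = rng[1], to = rng[2], n = n_grid)
  step <- dx$x[2] - dx$x[1]
  ## Riemann sum of the pointwise minimum of the two densities
  sum(pmin(dx$y, dy$y)) * step
}

set.seed(42)
ps_cur   <- rbeta(200, 4, 6)   # hypothetical current-study scores
ps_close <- rbeta(500, 4, 7)   # hypothetical RWD scores, similar distribution
ps_far   <- rbeta(500, 2, 12)  # hypothetical RWD scores, shifted distribution

ov_close <- overlap_area(ps_cur, ps_close)
ov_far   <- overlap_area(ps_cur, ps_far)
c(close = ov_close, far = ov_far)
```

Values near 1 indicate that the two groups have similar score distributions in that stratum, and values near 0 indicate little overlap.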

PS-integrated power prior approach for single arm studies

For single arm studies with one external data source, the function psrwe_powerp conducts the analysis proposed in Wang et al. (2019). The method uses propensity scores to pre-select a subset of real-world data containing patients who are similar to those in the current study in terms of covariates, and to stratify the selected patients together with those in the current study into more homogeneous strata. The power prior approach is then applied in each stratum to obtain stratum-specific posterior distributions, which are combined to complete the Bayesian inference for the parameters of interest.

ps_bor <- psrwe_borrow(dta_ps, total_borrow = 40,
                        method = "distance")
rst_pp <- psrwe_powerp(ps_bor, v_outcome = "Y_Bin",
                        outcome_type = "binary")

## 
## SAMPLING FOR MODEL 'powerpsbinary' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 5.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.55 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.144261 seconds (Warm-up)
## Chain 1:                0.103018 seconds (Sampling)
## Chain 1:                0.247279 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'powerpsbinary' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 4.4e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.44 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.133174 seconds (Warm-up)
## Chain 2:                0.095965 seconds (Sampling)
## Chain 2:                0.229139 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'powerpsbinary' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 3.4e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.34 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.133904 seconds (Warm-up)
## Chain 3:                0.091085 seconds (Sampling)
## Chain 3:                0.224989 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'powerpsbinary' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 3e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.3 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.135387 seconds (Warm-up)
## Chain 4:                0.094275 seconds (Sampling)
## Chain 4:                0.229662 seconds (Total)
## Chain 4:

Results can be further summarized as:

summary(rst_pp)

## $Overall
##         Type      Mean    StdErr
## Mean Control 0.3140914 0.0296843
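For a binary outcome, the stratum-specific power prior has a closed form that makes the idea easy to inspect. The sketch below is not what psrwe_powerp runs (the package samples with Stan, as the log above shows). It assumes a Beta(1, 1) initial prior, a fixed discounting parameter in each stratum equal to the number of subjects to borrow divided by the number of RWD subjects, and stratum weights proportional to the current-study sample size. The stratum sizes and distances are taken from the psrwe_est output above; the event counts y_cur and y_rwd are hypothetical.

```r
set.seed(1)

## Stratum sizes and distances from the psrwe_est output above
n_cur    <- rep(40, 5)
n_rwd    <- c(729, 156, 78, 50, 13)
distance <- c(0.5613996, 0.7208211, 0.8042718, 0.8104474, 0.7960132)

## Hypothetical numbers of responders in each stratum
y_cur <- c(12, 14, 11, 13, 12)
y_rwd <- c(230, 50, 25, 16, 4)

## Assumed rule: split 40 borrowed subjects in proportion to the distance,
## then discount each RWD subject by alpha = N_borrow / N_RWD
n_borrow <- 40 * distance / sum(distance)
alpha    <- pmin(1, n_borrow / n_rwd)

## Power prior with a Beta(1, 1) initial prior gives a Beta posterior
a_post <- 1 + y_cur + alpha * y_rwd
b_post <- 1 + (n_cur - y_cur) + alpha * (n_rwd - y_rwd)

## Posterior draws per stratum, combined with current-study weights
draws <- sapply(seq_along(n_cur),
                function(s) rbeta(4000, a_post[s], b_post[s]))
w     <- n_cur / sum(n_cur)
theta <- as.vector(draws %*% w)

c(Mean = mean(theta), StdErr = sd(theta))
```

Each RWD subject contributes only a fraction alpha of an observation, so the RWD in a stratum adds the equivalent of N_borrow subjects to the posterior regardless of how many RWD subjects that stratum contains.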

PS-integrated composite likelihood approach for single arm studies

For single arm studies with one external data source, the function psrwe_compl conducts the analysis proposed in Wang et al. (2020). In this approach, within each propensity score stratum, a composite likelihood function is specified and used to down-weight the information contributed by the external data source. Estimates of the stratum-specific parameters are obtained by maximizing the composite likelihood function. These stratum-specific estimates are then combined to obtain an overall population-level estimate of the parameter of interest.

rst_cl <- psrwe_compl(ps_bor, v_outcome = "Y_Bin",
                       outcome_type = "binary")
summary(rst_cl)

## $Overall
##      Type      Mean     StdErr
## 1 Control 0.3057535 0.02787001
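The down-weighting can be illustrated for one stratum with a binary outcome. In the sketch below, each external observation enters the log-likelihood with a weight lambda between 0 and 1, and the maximizer is compared with the weighted proportion it should equal. The function comp_loglik, the simulated outcomes, and the value of lambda are hypothetical; this is not the code used by psrwe_compl, and it does not reproduce the package's standard error calculation.

```r
## Composite log-likelihood: current-study data at full weight,
## external data down-weighted by lambda
comp_loglik <- function(theta, y_cur, y_ext, lambda) {
  sum(dbinom(y_cur, size = 1, prob = theta, log = TRUE)) +
    lambda * sum(dbinom(y_ext, size = 1, prob = theta, log = TRUE))
}

set.seed(2)
y_cur  <- rbinom(40, 1, 0.30)   # hypothetical current-study outcomes
y_ext  <- rbinom(156, 1, 0.35)  # hypothetical external outcomes
lambda <- 8 / 156               # hypothetical: borrow about 8 of 156 subjects

## Numerical maximization over the response rate
fit <- optimize(comp_loglik, interval = c(1e-6, 1 - 1e-6),
                y_cur = y_cur, y_ext = y_ext, lambda = lambda,
                maximum = TRUE)
theta_hat <- fit$maximum

## Closed form for the binary case: a weighted proportion
theta_closed <- (sum(y_cur) + lambda * sum(y_ext)) /
  (length(y_cur) + lambda * length(y_ext))

c(numerical = theta_hat, closed_form = theta_closed)
```

With lambda equal to 0 the estimate is the current-study proportion alone, and with lambda equal to 1 the external data are pooled at full weight.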

PS-integrated composite likelihood approach for randomized studies

For randomized studies with one external data source that contains control subjects, the function psrwe_compl conducts the analysis proposed in Chen et al. (2020). This propensity score-integrated composite likelihood approach augments the control arm of a two-arm randomized controlled trial with patients from the external data source. An example is given below.

data(ex_dta_rct)
dta_ps_rct <- psrwe_est(ex_dta_rct, v_covs = paste("V", 1:7, sep = ""),
                         v_grp = "Group", cur_grp_level = "current",
                         v_arm = "Arm", ctl_arm_level = "control")
dta_ps_rct

## This is a randomized study. A total of 1031 RWD subjects 
## and 200 current study subjects are used to estimate 
## propensity scores by logistic model. A total of 25 RWD 
## subjects are trimmed and excluded from the final analysis. 
## The following covariates are adjusted in the propensity 
## score model: V1, V2, V3, V4, V5, V6, V7.
## 
## The following table summarizes the number of subjects in 
## each stratum, and the distance in PS distributions 
## calculated by overlapping area:
## 
##     Stratum N_RWD N_RWD_CTL N_Current N_Cur_CTL N_Cur_TRT  Distance
## 1 Stratum 1   703       703        41        20        21 0.7212720
## 2 Stratum 2   120       120        34        20        14 0.7197000
## 3 Stratum 3    93        93        38        20        18 0.7673496
## 4 Stratum 4    72        72        43        20        23 0.6977744
## 5 Stratum 5    18        18        44        20        24 0.6181907

ps_bor_rct <- psrwe_borrow(dta_ps_rct, total_borrow = 30,
                            method = "distance")
ps_bor_rct

## A total of 30 subjects will be borrowed from the RWD. The 
## number 30 is split proportional to the distance in PS 
## distributions in each stratum. The following table 
## summarizes the number of subjects to be borrowed and the 
## weight parameter in each stratum:
## 
##     Stratum N_RWD N_RWD_CTL N_RWD_TRT N_Current N_Cur_CTL N_Cur_TRT  Distance
## 1 Stratum 1   703       703         0        41        20        21 0.7212720
## 2 Stratum 2   120       120         0        34        20        14 0.7197000
## 3 Stratum 3    93        93         0        38        20        18 0.7673496
## 4 Stratum 4    72        72         0        43        20        23 0.6977744
## 5 Stratum 5    18        18         0        44        20        24 0.6181907
##   Proportion N_Borrow      Alpha
## 1  0.2046576 6.139728 0.00873361
## 2  0.2042115 6.126346 0.05105288
## 3  0.2177319 6.531957 0.07023609
## 4  0.1979902 5.939707 0.08249593
## 5  0.1754087 5.262262 0.29234791
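The Proportion, N_Borrow, and Alpha columns in this table can be reproduced from the Distance and N_RWD columns. The sketch below infers the rule from the printed numbers rather than from the package source: the total is split in proportion to the distance, and the weight parameter is the number of subjects to borrow divided by the number of RWD subjects in the stratum. The cap at 1 is an added assumption that does not bind in this example.

```r
## Values copied from the psrwe_borrow output above
distance     <- c(0.7212720, 0.7197000, 0.7673496, 0.6977744, 0.6181907)
n_rwd        <- c(703, 120, 93, 72, 18)
total_borrow <- 30

## Split the total in proportion to the distance in each stratum
proportion <- distance / sum(distance)
n_borrow   <- total_borrow * proportion

## Weight parameter: fraction of each RWD subject that is borrowed
## (the cap at 1 is an assumption, not taken from the output)
alpha <- pmin(1, n_borrow / n_rwd)

data.frame(Stratum    = paste("Stratum", 1:5),
           Proportion = proportion,
           N_Borrow   = n_borrow,
           Alpha      = alpha)
```

Stratum 5 has the smallest distance and so receives the smallest share of the 30 subjects, but because it contains only 18 RWD subjects, each of them carries the largest weight.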

rst_cl_rct <- psrwe_compl(ps_bor_rct, v_outcome = "Y_Con",
                           outcome_type = "continuous")

rst_cl_rct$Effect

## $Stratum_Estimate
##        Mean   StdErr
## 1 13.826290 7.352086
## 2 -7.758183 5.926175
## 3 15.811277 7.817626
## 4 15.536262 6.249770
## 5 12.630633 7.373947
## 
## $Overall_Estimate
##       Mean   StdErr
## 1 10.63868 3.151204
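The overall estimate above is consistent with a weighted average of the stratum estimates, with weights proportional to N_Current, and a standard error that treats the strata as independent. The sketch below checks this against the printed numbers. The weighting rule is inferred from the output, not taken from the package source.

```r
## Values copied from the output above
est_s  <- c(13.826290, -7.758183, 15.811277, 15.536262, 12.630633)
se_s   <- c(7.352086, 5.926175, 7.817626, 6.249770, 7.373947)
n_cur  <- c(41, 34, 38, 43, 44)

## Weights proportional to the number of current-study subjects
w <- n_cur / sum(n_cur)

## Weighted mean, and standard error under independence across strata
overall_mean <- sum(w * est_s)
overall_se   <- sqrt(sum(w^2 * se_s^2))

round(c(Mean = overall_mean, StdErr = overall_se), 5)
```

Stratum 2 has a negative estimate, but the overall treatment effect remains positive because the other four strata carry 83% of the weight.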

psrwe: Propensity Score-Integrated Methods for Incorporating Real-World Evidence in Clinical Studies

Chenguang Wang

2022-02-28
