1 Introduction

This vignette explains how to perform scale linking with the PROsetta package. By way of illustration, we replicate the linking of the Center for Epidemiologic Studies Depression Scale (CES-D) to the PROMIS Depression metric as described in Choi, Schalet, Cook, and Cella (2014).

2 Load datasets

First step is to load the input datasets comprised of three tables with loadData(). The PROMIS Depression – CES-D linking data are included in the PROsetta package directory under the folder labeled data-raw.

fp <- system.file("data-raw", package = "PROsetta")
d <- loadData(
  response  = "dat_DeCESD_v2.csv",
  itemmap   = "imap_DeCESD.csv",
  anchor    = "anchor_DeCESD.csv",
  input_dir = fp)

response: Contains item response data from both instruments. You can supply a .csv filename or a data frame. In this example, we supply a .csv filename dat_DeCESD_v2.csv.
itemmap: Specifies which items belong to which instruments. Can be a .csv filename or a data frame.
anchor: Contains tem parameters for anchor items (e.g., PROMIS Depression). Can be a .csv filename or a data frame.
input_dir: (Optional) The path of the directory to look for the input .csv files.

2.1 Response data

The response data contains individual item responses on both instruments (i.e., 28 PROMIS Depression items followed by 20 CES-D items). The data table should include the following columns.

prosettaid: The person ID of the respondents (N = 747). This column does not have to be named prosettaid but should not conflict with other data tables (item map and anchor).
Other columns should include the item response fields with their unique item IDs as column names. The item names should match the item_id column in both the item map and anchor files.

Run the following code, for example, to open the response data in edit mode.

file.edit(system.file("data-raw", "dat_DeCESD_v2.csv", package = "PROsetta"))

2.2 Item map data

The item map data requires the following columns.

item_id: Contains the unique ID of the items. The name of this column does not have to be item_id but should be consistent with the item ID column in the anchor table. The IDs in this column should match the column names in the response data.
instrument: Numerals (1 or 2) indicating to which of the two instruments the items belong (e.g., 1 = PROMIS Depression; 2 = CES-D)
item_order: The sequential position of the items in the combined table (e.g., 1, 2, 3, …, 28, …, 48)
item_name: Secondary labels for the items
ncat: The number of response categories by item

Run the following code to open the item map data in edit mode.

file.edit(system.file("data-raw", "imap_DeCESD.csv", package = "PROsetta"))

2.3 Anchor data

The anchor data contains the item parameters for the anchor scale (e.g., PROMIS Depression) and requires the following columns.

item_order: The sequential position of the items in the anchor scale (e.g., 1, 2, 3, …, 28)
item_id: The unique ID of the anchor items. The name of this column does not have to be item_id but should be consistent with the item ID column in the item map table The IDs in this column should refer to the specific column names in the response data.
a: The slope parameter value for each anchor item
cb1, cb2, …: The category boundary parameter values for each anchor item
ncat: The number of response categories for each anchor item

Run the following code to open the anchor data in edit mode.

file.edit(system.file("data-raw", "anchor_DeCESD.csv", package = "PROsetta"))

3 Descriptive analysis

3.1 Basic descriptive statistics

The frequency distribution of each item in the response data is obtained by runFrequency().

freq_table <- runFrequency(d)
head(freq_table)

##           1   2   3  4  5
## EDDEP04 526 112  66 29 14
## EDDEP05 488 118  91 37 12
## EDDEP06 502 119  85 30 10
## EDDEP07 420 155 107 49 16
## EDDEP09 492 132  89 25  9
## EDDEP14 445 150 101 37 14

The frequency distribution of the summed scores for the combined scale can be plotted as a histogram with plot(). The required argument is a PROsetta_data object created with loadData(). The optional scale argument specifies for which scale the summed score should be created. Setting scale = 'combined' plots the summed score distribution for the combined scale.

plot(d, scale = "combined", title = "Combined scale")

The user can also generate the summed score distribution for the first or second scale by specifying scale = 1 or scale = 2.

plot(d, scale = 1, title = "Scale 1") # not run
plot(d, scale = 2, title = "Scale 2") # not run

Basic descriptive statistics are obtained for each item by runDescriptive().

desc_table <- runDescriptive(d)
head(desc_table)

##           n mean   sd median trimmed mad min max range skew kurtosis   se
## EDDEP04 747 1.52 0.94      1    1.30   0   1   5     4 1.91     3.01 0.03
## EDDEP05 746 1.62 0.99      1    1.42   0   1   5     4 1.54     1.50 0.04
## EDDEP06 746 1.56 0.94      1    1.37   0   1   5     4 1.66     2.01 0.03
## EDDEP07 747 1.78 1.05      1    1.59   0   1   5     4 1.23     0.59 0.04
## EDDEP09 747 1.56 0.91      1    1.38   0   1   5     4 1.62     1.98 0.03
## EDDEP14 747 1.69 1.00      1    1.51   0   1   5     4 1.38     1.12 0.04

3.2 Classical reliability analysis

Classical reliability statistics can be obtained by runClassical(). By default, the analysis is performed for the combined scale.

classical_table <- runClassical(d)
summary(classical_table$alpha$combined)

## 
## Reliability analysis   
##  raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.98      0.98    0.99      0.53  54 9e-04  1.7 0.69     0.54

The user can set scalewise = TRUE to request an analysis for each scale separately in addition to the combined scale.

classical_table <- runClassical(d, scalewise = TRUE)
classical_table$alpha$combined # alpha values for combined scale
classical_table$alpha$`1`      # alpha values for each scale, created when scalewise = TRUE
classical_table$alpha$`2`      # alpha values for each scale, created when scalewise = TRUE

Specifying omega = TRUE returns the McDonald’s $\omega$ coefficients as well.

classical_table <- runClassical(d, scalewise = TRUE, omega = TRUE)
classical_table$omega$combined # omega values for combined scale
classical_table$omega$`1`      # omega values for each scale, created when scalewise = TRUE
classical_table$omega$`2`      # omega values for each scale, created when scalewise = TRUE

Additional arguments can be supplied to runClassical() to pass onto psych::omega().

classical_table <- runClassical(d, scalewise = TRUE, omega = TRUE, nfactors = 5) # not run

3.3 Dimensionality analysis

Dimensionality analysis is performed with CFA by runCFA(). Setting scalewise = TRUE performs the dimensionality analysis for each scale separately in addition to the combined scale.

out_cfa <- runCFA(d, scalewise = TRUE)

runCFA() calls for lavaan::cfa() internally and can pass additional arguments onto it.

out_cfa <- runCFA(d, scalewise = TRUE, std.lv = TRUE) # not run

The CFA result for the combined scale is stored in the combined slot, and if scalewise = TRUE, the analysis for each scale is also stored in each numbered slot.

out_cfa$combined

## lavaan 0.6-9 ended normally after 23 iterations
## 
##   Estimator                                       DWLS
##   Optimization method                           NLMINB
##   Number of model parameters                       220
##                                                       
##                                                   Used       Total
##   Number of observations                           731         747
##                                                                   
## Model Test User Model:
##                                               Standard      Robust
##   Test Statistic                              4227.611    4700.780
##   Degrees of freedom                              1080        1080
##   P-value (Chi-square)                           0.000       0.000
##   Scaling correction factor                                  1.046
##   Shift parameter                                          657.455
##        simple second-order correction

out_cfa$`1`

## lavaan 0.6-9 ended normally after 16 iterations
## 
##   Estimator                                       DWLS
##   Optimization method                           NLMINB
##   Number of model parameters                       140
##                                                       
##                                                   Used       Total
##   Number of observations                           738         747
##                                                                   
## Model Test User Model:
##                                               Standard      Robust
##   Test Statistic                               863.527    1434.277
##   Degrees of freedom                               350         350
##   P-value (Chi-square)                           0.000       0.000
##   Scaling correction factor                                  0.678
##   Shift parameter                                          160.257
##        simple second-order correction

out_cfa$`2`

## lavaan 0.6-9 ended normally after 20 iterations
## 
##   Estimator                                       DWLS
##   Optimization method                           NLMINB
##   Number of model parameters                        80
##                                                       
##                                                   Used       Total
##   Number of observations                           740         747
##                                                                   
## Model Test User Model:
##                                               Standard      Robust
##   Test Statistic                              1106.148    1431.797
##   Degrees of freedom                               170         170
##   P-value (Chi-square)                           0.000       0.000
##   Scaling correction factor                                  0.812
##   Shift parameter                                           69.205
##        simple second-order correction

CFA fit indices can be obtained by using summary() from the lavaan package. For the combined scale:

lavaan::summary(out_cfa$combined, fit.measures = TRUE, standardized = TRUE, estimates = FALSE)

## lavaan 0.6-9 ended normally after 23 iterations
## 
##   Estimator                                       DWLS
##   Optimization method                           NLMINB
##   Number of model parameters                       220
##                                                       
##                                                   Used       Total
##   Number of observations                           731         747
##                                                                   
## Model Test User Model:
##                                               Standard      Robust
##   Test Statistic                              4227.611    4700.780
##   Degrees of freedom                              1080        1080
##   P-value (Chi-square)                           0.000       0.000
##   Scaling correction factor                                  1.046
##   Shift parameter                                          657.455
##        simple second-order correction                             
## 
## Model Test Baseline Model:
## 
##   Test statistic                            793198.133   91564.632
##   Degrees of freedom                              1128        1128
##   P-value                                        0.000       0.000
##   Scaling correction factor                                  8.758
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.996       0.960
##   Tucker-Lewis Index (TLI)                       0.996       0.958
##                                                                   
##   Robust Comparative Fit Index (CFI)                            NA
##   Robust Tucker-Lewis Index (TLI)                               NA
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.063       0.068
##   90 Percent confidence interval - lower         0.061       0.066
##   90 Percent confidence interval - upper         0.065       0.070
##   P-value RMSEA <= 0.05                          0.000       0.000
##                                                                   
##   Robust RMSEA                                                  NA
##   90 Percent confidence interval - lower                        NA
##   90 Percent confidence interval - upper                        NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.051       0.051

and also for each scale separately:

lavaan::summary(out_cfa$`1`, fit.measures = TRUE, standardized = TRUE, estimates = FALSE) # not run
lavaan::summary(out_cfa$`2`, fit.measures = TRUE, standardized = TRUE, estimates = FALSE) # not run

3.4 Item parameter calibration

runCalibration() performs IRT calibration without anchoring. runCalibration() calls mirt::mirt() internally. Additional arguments can be passed onto mirt, e.g., to increase the number of EM cycles to 1000, as follows:

out_calib <- runCalibration(d, technical = list(NCYCLES = 1000))

In case of nonconvergence, runCalibration() explicitly raises an error and does not return its results.

out_calib <- runCalibration(d, technical = list(NCYCLES = 10))

## Error in runCalibration(d, technical = list(NCYCLES = 10)): calibration did not converge: increase iteration limit by adjusting the 'technical' argument, e.g., technical = list(NCYCLES = 510)

Also, specify fixedpar = TRUE to perform fixed parameter calibration using the anchor data.

out_calib <- runCalibration(d, fixedpar = TRUE)

The output object from runCalibration() can be used to generate additional output with functions from the mirt package.

Use coef() to extract item parameters:

mirt::coef(out_calib, IRTpars = TRUE, simplify = TRUE)

## $items
##             a     b1    b2    b3    b4
## EDDEP04 4.261  0.401 0.976 1.696 2.444
## EDDEP05 3.932  0.305 0.913 1.593 2.412
## EDDEP06 4.145  0.350 0.915 1.678 2.471
## EDDEP07 2.802  0.148 0.772 1.603 2.538
## EDDEP09 3.657  0.312 0.982 1.782 2.571
## EDDEP14 2.333  0.186 0.947 1.729 2.633
## EDDEP17 3.274 -0.498 0.406 1.413 2.375
## EDDEP19 3.241  0.460 1.034 1.834 2.515
## EDDEP21 2.736  0.072 0.810 1.803 2.673
## EDDEP22 3.970  0.204 0.795 1.649 2.295
## EDDEP23 2.564 -0.038 0.693 1.653 2.584
## EDDEP26 3.093 -0.358 0.412 1.404 2.224
## EDDEP27 2.920  0.204 0.891 1.655 2.528
## EDDEP28 2.588 -0.079 0.633 1.477 2.328
## EDDEP29 4.343 -0.117 0.598 1.428 2.272
## EDDEP30 2.613 -0.023 0.868 1.864 2.826
## EDDEP31 3.183 -0.261 0.397 1.305 2.134
## EDDEP35 3.106  0.044 0.722 1.639 2.471
## EDDEP36 3.483 -0.536 0.348 1.347 2.355
## EDDEP39 3.131  0.918 1.481 2.164 2.856
## EDDEP41 4.454  0.558 1.074 1.779 2.530
## EDDEP42 2.364  0.210 0.987 1.906 2.934
## EDDEP44 2.549  0.194 1.012 2.013 3.126
## EDDEP45 2.834  0.141 0.907 1.846 2.875
## EDDEP46 2.381 -0.458 0.478 1.546 2.632
## EDDEP48 3.185  0.198 0.782 1.526 2.324
## EDDEP50 2.018 -0.050 0.926 2.000 2.966
## EDDEP54 2.685 -0.299 0.423 1.358 2.308
## CESD1   2.074  0.876 1.921 3.064    NA
## CESD2   1.262  1.387 2.670 3.721    NA
## CESD3   3.512  0.833 1.316 1.949    NA
## CESD4   1.118  0.649 1.379 2.081    NA
## CESD5   1.605  0.429 1.526 2.724    NA
## CESD6   3.635  0.493 1.176 1.729    NA
## CESD7   1.828  0.287 1.368 2.134    NA
## CESD8   1.342 -0.067 0.823 1.620    NA
## CESD9   3.003  0.748 1.374 1.855    NA
## CESD10  2.060  1.172 2.043 3.268    NA
## CESD11  1.077 -0.463 0.947 2.160    NA
## CESD12  2.229  0.169 0.945 1.737    NA
## CESD13  1.288  0.342 1.696 2.915    NA
## CESD14  2.176  0.491 1.291 1.864    NA
## CESD15  1.397  0.965 2.321 3.604    NA
## CESD16  2.133  0.272 0.922 1.808    NA
## CESD17  1.719  1.607 2.317 3.470    NA
## CESD18  2.812  0.261 1.248 1.984    NA
## CESD19  1.834  0.784 1.875 2.639    NA
## CESD20  1.491 -0.140 1.256 2.297    NA
## 
## $means
##    F1 
## -0.06 
## 
## $cov
##      F1
## F1 0.95

and also other commonly used functions:

mirt::itemfit(out_calib, empirical.plot = 1)
mirt::itemplot(out_calib, item = 1, type = "info")
mirt::itemfit(out_calib, "S_X2", na.rm = TRUE)

Scale information functions can be plotted with plotInfo. The two required arguments are an output object from runCalibration() and a PROsetta object from loadData(). The additional arguments specify the labels, colors, and line types for each scale and the combined scale. The last values in arguments scale_label, color, lty represent the values for the combined scale.

plotInfo(
  out_calib, d,
  scale_label = c("PROMIS Depression", "CES-D", "Combined"),
  color = c("blue", "red", "black"),
  lty = c(1, 2, 3))

4 Scale aligning

runLinking() performs item parameter linking based on the anchor item parameters supplied in the anchor table. Two linking, or more specifically scaling aligning, methods currently available are fixed-parameter calibration and linear transformation. Fixed-parameter calibration estimates the item parameters for the non-anchor items on the metric defined by the anchor items, while fixing the item parameters for the anchor items to their supplied anchor values. The linear transformation methods determine linear transformation constants, i.e., a slope and an intercept, to transform freely estimated item parameters to the metric defined by the anchor items.

4.1 Fixed parameter calibration method

Scale aligning through fixed parameter calibration is performed by setting method = "FIXEDPAR". The linked parameters are stored in the $ipar_linked slot.

out_link_fixedpar <- runLinking(d, method = "FIXEDPAR")

out_link_fixedpar$ipar_linked

##                a          b1        b2       b3       b4
## EDDEP04 4.261422  0.40106943 0.9756732 1.696300 2.444072
## EDDEP05 3.931743  0.30494182 0.9130961 1.593476 2.411682
## EDDEP06 4.144759  0.35011299 0.9153482 1.678203 2.470526
## EDDEP07 2.801804  0.14774854 0.7723478 1.602715 2.538057
## EDDEP09 3.657433  0.31195821 0.9818088 1.782108 2.571127
## EDDEP14 2.333381  0.18599340 0.9473173 1.728770 2.632643
## EDDEP17 3.274033 -0.49845044 0.4059439 1.413052 2.375459
## EDDEP19 3.240973  0.46049359 1.0344267 1.833595 2.514716
## EDDEP21 2.736104  0.07245992 0.8097820 1.803067 2.673441
## EDDEP22 3.970028  0.20379963 0.7954889 1.648707 2.295496
## EDDEP23 2.564431 -0.03841113 0.6926952 1.652820 2.583629
## EDDEP26 3.093368 -0.35762208 0.4124999 1.403863 2.223961
## EDDEP27 2.920056  0.20434012 0.8909159 1.654652 2.528368
## EDDEP28 2.588339 -0.07908821 0.6326205 1.477330 2.327715
## EDDEP29 4.342918 -0.11730304 0.5977487 1.428166 2.272495
## EDDEP30 2.612846 -0.02337743 0.8683854 1.864303 2.826340
## EDDEP31 3.182866 -0.26089392 0.3967655 1.305464 2.133989
## EDDEP35 3.105857  0.04371372 0.7223523 1.638758 2.471487
## EDDEP36 3.483012 -0.53588456 0.3475707 1.346781 2.354790
## EDDEP39 3.131213  0.91802061 1.4813240 2.163996 2.856377
## EDDEP41 4.454157  0.55838282 1.0742430 1.779346 2.530080
## EDDEP42 2.364413  0.21006534 0.9870936 1.905901 2.933758
## EDDEP44 2.549164  0.19350024 1.0117460 2.013109 3.126494
## EDDEP45 2.833605  0.14071213 0.9065001 1.846097 2.875194
## EDDEP46 2.380628 -0.45788346 0.4779996 1.545663 2.631512
## EDDEP48 3.185244  0.19814448 0.7819065 1.525815 2.324082
## EDDEP50 2.018099 -0.05044210 0.9258678 1.999516 2.965507
## EDDEP54 2.685300 -0.29880847 0.4234598 1.357851 2.307647
## CESD1   2.074442  0.87588000 1.9208517 3.063586       NA
## CESD2   1.262450  1.38731970 2.6695226 3.720540       NA
## CESD3   3.512477  0.83271884 1.3159735 1.948794       NA
## CESD4   1.118259  0.64882022 1.3786370 2.081167       NA
## CESD5   1.604924  0.42935043 1.5261799 2.723723       NA
## CESD6   3.634668  0.49267515 1.1756152 1.729146       NA
## CESD7   1.827676  0.28707337 1.3677916 2.134161       NA
## CESD8   1.341784 -0.06693047 0.8227682 1.619581       NA
## CESD9   3.002587  0.74752445 1.3741501 1.855369       NA
## CESD10  2.060394  1.17178033 2.0425563 3.268036       NA
## CESD11  1.076593 -0.46319695 0.9471572 2.159803       NA
## CESD12  2.229316  0.16859288 0.9449513 1.736562       NA
## CESD13  1.288445  0.34213331 1.6959381 2.915162       NA
## CESD14  2.176407  0.49140714 1.2914091 1.864275       NA
## CESD15  1.396543  0.96457795 2.3205020 3.603940       NA
## CESD16  2.132818  0.27238860 0.9216832 1.808276       NA
## CESD17  1.718811  1.60657580 2.3169977 3.469721       NA
## CESD18  2.812159  0.26144281 1.2484443 1.984112       NA
## CESD19  1.833676  0.78350053 1.8751255 2.638592       NA
## CESD20  1.490735 -0.14037590 1.2558582 2.296856       NA

4.2 Linear transformation methods

Scale aligning through linear transformation is performed by setting the method argument to one of the following options:

MM (Mean-Mean)
MS (Mean-Sigma)
HB (Haebara)
SL (Stocking-Lord)

Arguments supplied to runLinking are passed onto mirt::mirt() internally. In case of nonconvergence in the free calibration step, runLinking() explicitly raises an error and does not return its results.

out_link_sl <- runLinking(d, method = "SL", technical = list(NCYCLES = 1000))
out_link_sl

The item parameter estimates linked to the anchor metric are stored in the $ipar_linked slot.

out_link_sl$ipar_linked

##                a           b1        b2       b3       b4
## EDDEP04 3.793161  0.469530551 1.0795350 1.723900 2.349080
## EDDEP05 3.320380  0.332903277 0.8987342 1.652724 2.495485
## EDDEP06 3.425968  0.387802947 1.0150063 1.791367 2.554685
## EDDEP07 2.515490  0.079235109 0.8041320 1.613013 2.489921
## EDDEP09 3.473136  0.333925352 1.0061928 1.872557 2.581161
## EDDEP14 2.593311  0.166259562 0.8908913 1.766003 2.578983
## EDDEP17 3.271634 -0.484050567 0.3934638 1.378818 2.485111
## EDDEP19 3.288519  0.594093189 1.1569664 1.908307 2.491027
## EDDEP21 2.658942 -0.008883032 0.7996149 1.769781 2.587528
## EDDEP22 3.694447  0.202954598 0.7692452 1.679950 2.211225
## EDDEP23 2.558598 -0.112441069 0.6118728 1.527154 2.413630
## EDDEP26 2.978928 -0.359701425 0.4134169 1.383127 2.206006
## EDDEP27 2.945571  0.220218470 0.9013523 1.670465 2.546562
## EDDEP28 2.533119 -0.089891132 0.6379695 1.516825 2.353177
## EDDEP29 3.641962 -0.108954575 0.6340853 1.413200 2.313974
## EDDEP30 2.366329 -0.144282442 0.8036419 1.852479 2.826838
## EDDEP31 2.859310 -0.272675089 0.3784892 1.397099 2.247133
## EDDEP35 2.833419 -0.091131191 0.6508336 1.594374 2.513021
## EDDEP36 3.831055 -0.409409211 0.4298663 1.362454 2.182998
## EDDEP39 2.921752  1.013208977 1.5973594 2.147477 2.830996
## EDDEP41 4.290619  0.403879976 1.0501835 1.653612 2.444910
## EDDEP42 2.203147 -0.061878352 0.8051774 1.819106 2.845104
## EDDEP44 2.487211  0.265078137 1.0153973 2.024893 3.207768
## EDDEP45 2.598389  0.071892748 0.9289133 2.065916 2.910995
## EDDEP46 2.580547 -0.300956201 0.4971925 1.463755 2.349175
## EDDEP48 3.192270  0.301874524 0.7763961 1.670600 2.269183
## EDDEP50 2.171919  0.094996852 0.9605635 1.915714 2.757297
## EDDEP54 2.962340 -0.178813349 0.5521133 1.377950 2.253458
## CESD1   2.061779  0.879468923 1.9314933 3.076904       NA
## CESD2   1.255554  1.392413300 2.6815860 3.738048       NA
## CESD3   3.508817  0.837483846 1.3247576 1.957232       NA
## CESD4   1.111395  0.650007324 1.3850762 2.091836       NA
## CESD5   1.593890  0.428582958 1.5343148 2.738424       NA
## CESD6   3.618615  0.492972124 1.1842575 1.739733       NA
## CESD7   1.811273  0.285422401 1.3764427 2.148182       NA
## CESD8   1.332344 -0.070810659 0.8265571 1.629126       NA
## CESD9   2.980182  0.751486735 1.3836642 1.866730       NA
## CESD10  2.054306  1.176185489 2.0499467 3.277152       NA
## CESD11  1.072249 -0.469512670 0.9481340 2.166840       NA
## CESD12  2.220284  0.164843960 0.9489153 1.745282       NA
## CESD13  1.279643  0.340717619 1.7050034 2.932279       NA
## CESD14  2.165046  0.490996018 1.2981449 1.874621       NA
## CESD15  1.388293  0.967787304 2.3321682 3.621488       NA
## CESD16  2.119598  0.270262537 0.9262685 1.818613       NA
## CESD17  1.720576  1.609372015 2.3204181 3.472175       NA
## CESD18  2.803351  0.257880402 1.2547114 1.992341       NA
## CESD19  1.820455  0.786210490 1.8859182 2.652889       NA
## CESD20  1.482218 -0.145778099 1.2604344 2.307638       NA

Transformation constants (A = slope; B = intercept) for the specified linear transformation method are stored in the $constants slot.

out_link_sl$constants

##         A         B 
##  0.982306 -0.064442

4.3 Obtaining scaled scores

From the item parameter estimates transformed to the anchor metric, raw-score-to-scale-score (rsss) crosswalk tables can be generated by runRSSS().

The output from runRSSS() includes three crosswalk tables (labeled as 1, 2, and combined), one for each scale and the third one for the combined scale. Each table contains raw summed scores and corresponding scaled scores, including summed score EAP estimate, T-scores corresponding to the EAP estimates, as well as expected summed scores (i.e., true scores) for each scale from the EAP estimates.

rsss_fixedpar <- runRSSS(d, out_link_fixedpar)
rsss_sl       <- runRSSS(d, out_link_sl)
round(rsss_fixedpar$`2`, 3)

##    raw_2 tscore tscore_se    eap eap_se escore_1 escore_2 escore_combined
## 1     20   34.5       6.0 -1.554  0.599   28.455   21.105          49.560
## 2     21   38.6       5.1 -1.139  0.509   29.340   21.851          51.192
## 3     22   41.1       4.7 -0.892  0.473   30.503   22.523          53.026
## 4     23   42.9       4.5 -0.713  0.455   31.837   23.154          54.991
## 5     24   44.7       4.1 -0.534  0.412   33.735   23.950          57.685
## 6     25   46.2       3.8 -0.382  0.382   35.853   24.777          60.630
## 7     26   47.5       3.6 -0.248  0.357   38.134   25.636          63.769
## 8     27   48.7       3.3 -0.128  0.335   40.558   26.535          67.093
## 9     28   49.8       3.2 -0.020  0.316   43.052   27.459          70.511
## 10    29   50.8       3.0  0.080  0.300   45.580   28.401          73.981
## 11    30   51.7       2.9  0.171  0.287   48.114   29.358          77.471
## 12    31   52.6       2.8  0.256  0.275   50.629   30.324          80.953
## 13    32   53.4       2.7  0.336  0.266   53.105   31.297          84.402
## 14    33   54.1       2.6  0.411  0.257   55.527   32.273          87.801
## 15    34   54.8       2.5  0.482  0.250   57.886   33.251          91.137
## 16    35   55.5       2.4  0.550  0.244   60.176   34.228          94.405
## 17    36   56.2       2.4  0.615  0.239   62.397   35.205          97.601
## 18    37   56.8       2.3  0.678  0.235   64.549   36.179         100.728
## 19    38   57.4       2.3  0.739  0.231   66.637   37.152         103.788
## 20    39   58.0       2.3  0.798  0.228   68.665   38.122         106.788
## 21    40   58.6       2.3  0.856  0.225   70.639   39.092         109.731
## 22    41   59.1       2.2  0.912  0.223   72.564   40.060         112.625
## 23    42   59.7       2.2  0.967  0.221   74.444   41.029         115.472
## 24    43   60.2       2.2  1.022  0.219   76.283   41.997         118.280
## 25    44   60.8       2.2  1.075  0.218   78.086   42.965         121.051
## 26    45   61.3       2.2  1.128  0.216   79.858   43.933         123.792
## 27    46   61.8       2.2  1.181  0.215   81.604   44.902         126.506
## 28    47   62.3       2.1  1.233  0.215   83.329   45.871         129.200
## 29    48   62.8       2.1  1.285  0.214   85.038   46.840         131.878
## 30    49   63.4       2.1  1.336  0.214   86.737   47.810         134.547
## 31    50   63.9       2.1  1.388  0.214   88.432   48.780         137.211
## 32    51   64.4       2.1  1.439  0.214   90.126   49.750         139.875
## 33    52   64.9       2.1  1.491  0.214   91.823   50.721         142.544
## 34    53   65.4       2.1  1.543  0.215   93.528   51.692         145.220
## 35    54   65.9       2.2  1.595  0.216   95.241   52.666         147.907
## 36    55   66.5       2.2  1.647  0.217   96.965   53.640         150.605
## 37    56   67.0       2.2  1.701  0.218   98.700   54.616         153.316
## 38    57   67.5       2.2  1.755  0.220  100.446   55.594         156.040
## 39    58   68.1       2.2  1.809  0.222  102.205   56.572         158.777
## 40    59   68.7       2.2  1.865  0.225  103.978   57.549         161.528
## 41    60   69.2       2.3  1.922  0.227  105.768   58.526         164.294
## 42    61   69.8       2.3  1.980  0.231  107.579   59.499         167.078
## 43    62   70.4       2.3  2.040  0.234  109.416   60.467         169.884
## 44    63   71.0       2.4  2.101  0.239  111.287   61.429         172.716
## 45    64   71.6       2.4  2.164  0.243  113.196   62.383         175.579
## 46    65   72.3       2.5  2.230  0.248  115.149   63.326         178.476
## 47    66   73.0       2.5  2.297  0.254  117.147   64.259         181.407
## 48    67   73.7       2.6  2.367  0.260  119.185   65.181         184.366
## 49    68   74.4       2.7  2.440  0.267  121.251   66.091         187.342
## 50    69   75.2       2.7  2.517  0.274  123.325   66.991         190.316
## 51    70   76.0       2.8  2.597  0.282  125.381   67.883         193.263
## 52    71   76.8       2.9  2.682  0.290  127.385   68.767         196.152
## 53    72   77.7       3.0  2.772  0.299  129.304   69.647         198.951
## 54    73   78.7       3.1  2.868  0.308  131.104   70.522         201.626
## 55    74   79.7       3.2  2.970  0.316  132.754   71.393         204.147
## 56    75   80.8       3.2  3.079  0.323  134.224   72.253         206.478
## 57    76   81.9       3.3  3.195  0.326  135.490   73.090         208.580
## 58    77   83.1       3.2  3.314  0.322  136.534   73.882         210.416
## 59    78   84.3       3.1  3.433  0.310  137.354   74.604         211.959
## 60    79   85.5       2.9  3.549  0.287  137.973   75.238         213.211
## 61    80   86.5       2.6  3.654  0.256  138.415   75.762         214.177

The columns in the crosswalk tables include:

raw_1: raw summed score in Scale 1 (also raw_2 for Scale 2 and raw_3 for the combined)
tscore: T-score corresponding to each summed score
tscore_se: standard error associated with each T-score
eap: summed score EAP equivalent for each raw summed score
eap_se: standard error associated with each EAP estimate
escore_1: expected summed score (true score) for Scale 1 given the EAP estimate
escore_2: expected summed score (true score) for Scale 2 given the EAP estimate
escore_combined: expected summed score (true score) for the combined scale given the EAP estimate

4.4 Equipercentile method: raw-raw

Equipercentile linking of observed summed scores is performed by runEquateObserved().

Cases with missing responses are removed to be able to generate correct summed scores in concordance tables.

This function requires four arguments:

scale_from: numeric index of the scale (as specified in the item map) to be linked
scale_to: numeric index of the scale (as specified in the item map) to serve as the anchor
eq_type: the type of equating to be performed, equipercentile for this example. See ?equate::equate for details.
smooth: the type of presmoothing to perform

By default, runEquateObserved() performs raw-raw equipercentile linking. In this example, each raw summed score in Scale 2 (CES-D, ranging from 20 to 80) is linked to a raw summed score equivalent in Scale 1 (PROMIS Depression, rangeing from 28 to 140) with loglinear presmoothing.

out_equate <- runEquateObserved(
  d, scale_from = 2, scale_to = 1,
  eq_type = "equipercentile", smooth = "loglinear")

The crosswalk table can be obtained from the concordance slot:

out_equate$concordance

##    raw_2     raw_1  raw_1_se raw_1_se_boot
## 1     20  28.26881 0.1620936     0.1572725
## 2     21  29.88299 0.3370778     0.3524869
## 3     22  31.53492 0.5039397     0.5224066
## 4     23  33.26052 0.6025077     0.6328627
## 5     24  35.04116 0.7605541     0.7796375
## 6     25  36.88206 0.9289581     0.9829007
## 7     26  38.78932 1.1047990     1.1851165
## 8     27  40.76619 1.2851847     1.3870162
## 9     28  42.81301 1.4675514     1.5856481
## 10    29  44.92722 1.6498695     1.7068410
## 11    30  47.10359 1.8307274     1.8496648
## 12    31  49.33456 2.0092834     2.0593139
## 13    32  51.61601 2.2830630     2.1836061
## 14    33  53.93979 2.4363247     2.3130168
## 15    34  56.28523 2.5838111     2.5257043
## 16    35  58.64518 2.8128224     2.7802025
## 17    36  61.00920 2.9230117     2.8931144
## 18    37  63.35840 3.0265741     3.0806429
## 19    38  65.68830 3.1931417     3.1946661
## 20    39  67.98827 3.2579872     3.3153735
## 21    40  70.24833 3.3160014     3.5869175
## 22    41  72.46413 3.3666557     3.8021905
## 23    42  74.63553 3.4696445     3.9777686
## 24    43  76.75786 3.4943506     4.0420584
## 25    44  78.83013 3.5156665     4.1275127
## 26    45  80.85299 3.5347321     4.0853398
## 27    46  82.82775 3.5532238     4.0560504
## 28    47  84.75618 3.5733950     4.1428855
## 29    48  86.64034 3.5981071     3.9777269
## 30    49  88.48280 3.5457997     3.9632469
## 31    50  90.28977 3.5652250     3.9612782
## 32    51  92.06044 3.5948509     3.9079733
## 33    52  93.79686 3.6397275     3.8545522
## 34    53  95.50073 3.7057230     3.9267090
## 35    54  97.18390 3.6406951     3.8852056
## 36    55  98.83962 3.7115896     3.8946961
## 37    56 100.46992 3.6731482     3.8106845
## 38    57 102.08880 3.7516801     3.7151516
## 39    58 103.68535 3.8710358     3.7631227
## 40    59 105.27044 3.8253417     3.8246051
## 41    60 106.84491 3.9636676     3.9868970
## 42    61 108.40599 3.9344712     4.1974309
## 43    62 109.96920 4.0965312     4.4477086
## 44    63 111.51836 4.3289986     4.4844314
## 45    64 113.08139 4.2728002     4.5890397
## 46    65 114.63738 4.5418401     4.6718360
## 47    66 116.20861 4.4956647     4.8020358
## 48    67 117.78972 4.7999844     5.1217881
## 49    68 119.38584 4.7678715     5.5287865
## 50    69 121.01883 5.0971545     5.7227347
## 51    70 122.67774 5.5168449     6.2039759
## 52    71 124.38914 5.4272766     6.2447116
## 53    72 126.17776 5.8150222     6.4440662
## 54    73 128.05913 6.2277413     6.6430281
## 55    74 130.07996 6.6010857     6.8569590
## 56    75 132.31127 6.8333301     6.6366633
## 57    76 134.91376 7.4633808     6.0750272
## 58    77 138.21074 6.3457311     4.1921004

4.5 Equipercentile method: raw-tscore

Raw summed scores can be linked to scaled scores (e.g., T-scores) directly by specifying type_to = 'tscore' in runEquateObserved(). In the following example, we map the raw summed scores from Scale 2 (CES-D, ranging from 20 to 80) onto the T-score equivalents in Scale 1 (PROMIS Depression, mean = 50 and SD = 10).

out_equate_tscore <- runEquateObserved(
  d, scale_from = 2, scale_to = 1,
  type_to = "tscore", rsss = rsss_fixedpar,
  eq_type = "equipercentile", smooth = "loglinear")

Again, the crosswalk table can be retrieved from the concordance slot:

out_equate_tscore$concordance

##    raw_2 tscore_1 tscore_1_se tscore_1_se_boot
## 1     20 33.60138   0.1015887        0.3824395
## 2     21 38.46567   0.2970127        1.0535405
## 3     22 41.72102   0.5244657        0.5765933
## 4     23 43.93269   0.7283759        0.6155252
## 5     24 45.56073   0.9008862        0.5917980
## 6     25 46.96056   1.0676547        0.6328539
## 7     26 48.14591   1.2231568        0.6331400
## 8     27 49.15361   1.4254088        0.6894146
## 9     28 50.08462   1.5481891        0.7041770
## 10    29 50.93574   1.6767468        0.7062753
## 11    30 51.69914   1.7912740        0.6876076
## 12    31 52.45221   1.9745160        0.6803545
## 13    32 53.09555   2.0699538        0.6922414
## 14    33 53.81778   2.1925590        0.7087734
## 15    34 54.43935   2.2939903        0.7020589
## 16    35 55.05249   2.3979162        0.6816096
## 17    36 55.65876   2.5043386        0.7053824
## 18    37 56.19032   2.5834537        0.7389035
## 19    38 56.74801   2.6939470        0.7416404
## 20    39 57.33437   2.8072216        0.7742922
## 21    40 57.91858   2.9233371        0.7935000
## 22    41 58.49310   3.0422752        0.7992520
## 23    42 58.98081   3.1276569        0.8429419
## 24    43 59.55737   3.2512041        0.8559615
## 25    44 60.13559   3.2571331        0.9369305
## 26    45 60.71592   3.3751754        1.0085218
## 27    46 61.29874   3.4950840        1.0432083
## 28    47 61.88462   3.6164946        1.0527379
## 29    48 62.38451   3.6947613        1.0711343
## 30    49 62.95738   3.8157601        1.0827359
## 31    50 63.54690   3.9367948        1.1340440
## 32    51 64.14007   4.0572121        1.2260907
## 33    52 64.73684   4.1763139        1.2526451
## 34    53 65.33702   4.2933638        1.2498766
## 35    54 65.94024   4.4075839        1.2700091
## 36    55 66.54594   4.5181439        1.3143697
## 37    56 67.24018   4.6807053        1.3974567
## 38    57 67.87194   4.7833243        1.4324382
## 39    58 68.48824   4.8795677        1.5091217
## 40    59 69.10342   4.9680991        1.6490404
## 41    60 69.76017   5.0472563        1.7192766
## 42    61 70.43094   5.1798624        1.7309410
## 43    62 71.04031   5.2369658        1.7453752
## 44    63 71.74400   5.3442550        1.7157082
## 45    64 72.39134   5.3682887        1.7417181
## 46    65 73.08172   5.4387347        1.9192059
## 47    66 73.76203   5.4859732        2.1170064
## 48    67 74.42874   5.5059813        2.1822968
## 49    68 75.13082   5.5660930        2.2521448
## 50    69 75.80853   5.5266085        2.4854035
## 51    70 76.58968   5.6021126        2.7011187
## 52    71 77.31778   5.2885420        3.1427644
## 53    72 78.17632   5.2651465        3.3316763
## 54    73 79.13091   5.3383723        3.5466934
## 55    74 80.17216   4.8749758        3.6745582
## 56    75 81.48607   4.8123350        3.8608612
## 57    76 83.14663   4.7538824        3.5389564
## 58    77 85.48248   3.1143191        2.4021610

In what follows we display the linking relation obtained from the equipercentile method and compare it to that from the fixed-parameter calibration method.

plot(
  rsss_fixedpar$`2`$raw_2,
  rsss_fixedpar$`2`$tscore,
  xlab = "CES-D Summed Score",
  ylab = "PROMIS Depression T-score",
  type = "l", col = "blue")
lines(
  out_equate_tscore$concordance$raw_2,
  out_equate_tscore$concordance$tscore_1,
  lty = 2, col = "red")
grid()
legend(
  "topleft",
  c("Fixed-Parameter Calibration", "Equipercentile Linking"),
  lty = 1:2, col = c("blue", "red"), bg = "white"
)

5 Evaluation of linking results

The linking results produced so far are now evaluated. More specifically, we assess how closely the CES-D summed scores linked to the PROMIS Depression T-scores match the actual PROMIS Depression T-scores observed in the present linking sample. Should we have set aside a validation sample, we would have performed this evaluation on that sample.

5.1 Raw scores from Scale 2

To begin with, we create an object scores using getScaleSum() to contain raw summed scores on Scale 2 (i.e., CES-D). NA will result for any respondents with one or more missing responses on Scale 2. We could also create a summed score variable for Scale 1 using the same function, e.g., getScaleSum(d, 1).

scores <- getScaleSum(d, 2)
head(scores)

##   prosettaid raw_2
## 1     100048    21
## 2     100049    21
## 3     100050    24
## 4     100051    26
## 5     100052    20
## 6     100053    21

5.2 EAP estimates based on item responses patterns on Scale 1

We obtain EAP estimates of theta on Scale 1 (i.e., PROMIS Depression) based on item response patterns using the getTheta() function. The first argument of the function is a data object of PROsetta class, which we created earlier with loadData(). The second argument specifies the item parameter estimates to be used for the EAP estimation. Here, we use the item parameter estimates previously obtained from the fixed-parameter calibration, out_link_fixedpar$ipar_linked. The third argument scale = 1 specifies the scale to be scored (i.e., PROMIS Depression). These EAP estimates are based on the item responses actually observed on PROMIS Depression and will serve as the reference when we assess the CES-D scores liked to PROMIS Depression derived from various methods.

eap_promis <- getTheta(d, out_link_fixedpar$ipar_linked, scale = 1)$theta
head(eap_promis)

##   prosettaid   theta_eap  theta_se
## 1     100048 -0.42410653 0.1606312
## 2     100049 -1.15269342 0.3229802
## 3     100050  0.05281656 0.1176509
## 4     100051 -0.04622278 0.1238298
## 5     100052 -1.65063866 0.5049212
## 6     100053 -0.55824065 0.1812502

The EAP estimates for PROMIS Depression will be converted to T-scores using a linear transformation.

t_promis <- data.frame(
  prosettaid = eap_promis$prosettaid,
  t_promis = round(eap_promis$theta_eap * 10 + 50, 1)
)
head(t_promis)

##   prosettaid t_promis
## 1     100048     45.8
## 2     100049     38.5
## 3     100050     50.5
## 4     100051     49.5
## 5     100052     33.5
## 6     100053     44.4

We then merge the PROMIS Depression T-scores with the raw summed scores for CES-D calculated in the previous step.

scores <- merge(scores, t_promis, by = "prosettaid")
head(scores)

##   prosettaid raw_2 t_promis
## 1     100048    21     45.8
## 2     100049    21     38.5
## 3     100050    24     50.5
## 4     100051    26     49.5
## 5     100052    20     33.5
## 6     100053    21     44.4

Now we are going to generate T-scores linked to PROMIS Depression using only item responses on Scale 2 (CES-D). These T-scores linked to PROMIS Depression can be generated in different ways as:

EAP estimates based on item response patterns on Scale 2
EAP estimates based on summed scores on Scale 2
Equipercentile equivalents based on summed scores on Scale 2

The first two ways are based on the CES-D item parameters linked to the PROMIS Depression metric.

5.3 EAP estimates based on item responses patterns on Scale 2

First, we get EAP estimates based on item response patterns on Scale 2 using the CES-D item parameters linked to the PROMIS Depression metric (via fixed-parameter calibration). We then linearly transform the EAP estimates to T-scores and add the T-scores (t_cesd_pattern) to the data frame object scores.

eap_cesd <- getTheta(d, out_link_fixedpar$ipar_linked, scale = 2)$theta
t_cesd_pattern <- data.frame(
  prosettaid = eap_cesd$prosettaid,
  t_cesd_pattern = round(eap_cesd$theta_eap * 10 + 50, 1)
)
scores <- merge(scores, t_cesd_pattern, by = "prosettaid")
head(scores)

##   prosettaid raw_2 t_promis t_cesd_pattern
## 1     100048    21     45.8           37.6
## 2     100049    21     38.5           37.6
## 3     100050    24     50.5           45.1
## 4     100051    26     49.5           48.5
## 5     100052    20     33.5           34.5
## 6     100053    21     44.4           37.6

5.4 EAP estimates based on summed scores on Scale 2

Second, we use the raw-score-to-scale-score (RSSS) crosswalk table obtained above using summed score EAP estimation to map each raw summed score on Scale 2 onto a T-score on the PROMIS Depression metric, t_cesd_rsss.

rsss_eap <- data.frame(
  raw_2 = rsss_fixedpar$`2`$raw_2,
  t_cesd_rsss = round(rsss_fixedpar$`2`$tscore, 1)
)
scores <- merge(scores, rsss_eap, by = "raw_2")
head(scores)

##   raw_2 prosettaid t_promis t_cesd_pattern t_cesd_rsss
## 1    20     101352     37.7           34.5        34.5
## 2    20     100621     54.6           34.5        34.5
## 3    20     104958     33.5           34.5        34.5
## 4    20     100059     38.9           34.5        34.5
## 5    20     104756     37.7           34.5        34.5
## 6    20     105375     37.6           34.5        34.5

5.5 Equipercentile linking of summed scores on Scale 2

Third, we use the concordance table from equipercentile linking to map each raw summed score on Scale 2 onto a T-score on the PROMIS Depression metric, t_cesd_eqp.

rsss_eqp <- data.frame(
  raw_2 = out_equate_tscore$concordance$raw_2,
  t_cesd_eqp = round(out_equate_tscore$concordance$tscore_1, 1)
)
scores <- merge(scores, rsss_eqp, by = "raw_2")
head(scores)

##   raw_2 prosettaid t_promis t_cesd_pattern t_cesd_rsss t_cesd_eqp
## 1    20     101352     37.7           34.5        34.5       33.6
## 2    20     100621     54.6           34.5        34.5       33.6
## 3    20     104958     33.5           34.5        34.5       33.6
## 4    20     100059     38.9           34.5        34.5       33.6
## 5    20     104756     37.7           34.5        34.5       33.6
## 6    20     105375     37.6           34.5        34.5       33.6

5.6 Comparison of equated and observed T-scores

Finally, use compareScores() to compare the obtained T-scores.

# Reference score: IRT pattern scoring of Scale 1
c_pattern <- compareScores(
  scores$t_promis, scores$t_cesd_pattern) ## IRT response pattern EAP to T-score
c_rsss <- compareScores(
  scores$t_promis, scores$t_cesd_rsss)    ## IRT summed score EAP to T-score
c_eqp <- compareScores(
  scores$t_promis, scores$t_cesd_eqp)     ## Equipercentile summed score to T-score

stats           <- rbind(c_pattern, c_rsss, c_eqp)
rownames(stats) <- c("IRT Pattern", "IRT RSSS", "Equipercentile")
stats

##                     corr        mean       sd     rmsd      mad
## IRT Pattern    0.8437275  0.31409029 5.452607 5.461646 4.237209
## IRT RSSS       0.8212425  0.09425445 5.772118 5.772887 4.442818
## Equipercentile 0.8153648 -0.01121751 5.849425 5.849436 4.450889

Scale linking with PROsetta package