In general, the broken stick model smooths the observed growth trajectory. What happens if all observations are already aligned to the break ages? Does the model perfectly represent the data? Is the covariance matrix of the random effects (\(\Omega\)) equal to the covariance between the measurements? Is \(\sigma^2\) equal to zero?
We adapt code from http://www.davekleinschmidt.com/sst-mixed-effects-simulation/simulations_slides.pdf to generate test data:
library("plyr")
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
##
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
library("mvtnorm")
make_data_generator <- function(resid_var = 1,
                                ranef_covar = diag(c(1, 1)), n = 100
                                ) {
  ni <- nrow(ranef_covar)
  generate_data <- function() {
    # sample data set under mixed effects model with random slope/intercepts
    simulated_data <- rdply(n, {
      b <- t(rmvnorm(n = 1, sigma = ranef_covar))
      epsilon <- rnorm(n = length(b), mean = 0, sd = sqrt(resid_var))
      b + epsilon
    })
    data.frame(
      subject = rep(1:n, each = ni),
      age = rep(1:ni, n),
      simulated_data)
  }
}
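For readers who prefer not to depend on plyr and mvtnorm, the same sampling scheme can be sketched in base R. This is only an illustration under the same model assumptions: the helper name `simulate_aligned` is hypothetical and not part of the adapted code, and the random effects are drawn via a Cholesky factor instead of `rmvnorm()`.

```r
# Hypothetical base-R helper (not part of the original code): draws, for each
# subject, random effects b ~ N(0, ranef_covar) plus independent residual noise.
simulate_aligned <- function(n = 100, ranef_covar = diag(2), resid_var = 1) {
  ni <- nrow(ranef_covar)
  # n x ni matrix of standard normal draws; one row per subject
  z <- matrix(rnorm(n * ni), nrow = n, ncol = ni)
  # chol() returns upper-triangular R with t(R) %*% R == ranef_covar,
  # so each row of z %*% R has covariance ranef_covar
  b <- z %*% chol(ranef_covar)
  # residual noise with variance resid_var
  eps <- matrix(rnorm(n * ni, mean = 0, sd = sqrt(resid_var)),
                nrow = n, ncol = ni)
  y <- b + eps
  # long format: all ages of subject 1, then all ages of subject 2, ...
  data.frame(
    subject = rep(seq_len(n), each = ni),
    age = rep(seq_len(ni), times = n),
    X1 = as.vector(t(y))
  )
}
```

Calling `simulate_aligned(n = 10000, ranef_covar = covar, resid_var = 2)` would produce a data frame with the same `subject`, `age` and `X1` columns as the generator above, though not the same random draws.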
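Let us model the situation where the ages align perfectly with the knots. The perfect case \(\sigma^2 = 0\) would correspond to setting resid_var
to zero; the call below uses resid_var = 2, so the simulated measurements contain residual noise.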
set.seed(77711)
covar <- matrix(c(1, 0.7, 0.5, 0.3,
                  0.7, 1, 0.8, 0.5,
                  0.5, 0.8, 1, 0.6,
                  0.3, 0.5, 0.6, 1), nrow = 4)
gen_dat <- make_data_generator(n = 10000,
                               ranef_covar = covar,
                               resid_var = 2)
data <- gen_dat()
head(data)
## subject age .n X1
## 1 1 1 1 -0.95843228
## 2 1 2 1 -2.28067938
## 3 1 3 1 -2.71322094
## 4 1 4 1 -2.67658325
## 5 2 1 2 0.01657697
## 6 2 2 2 -1.51680763
Check the correlation matrix of the \(y\)’s.
library("tidyr")
library("dplyr")
d <- as_tibble(data[,-3])
broad <- t(spread(d, subject, X1))[-1,]
cor(broad)
## [,1] [,2] [,3] [,4]
## [1,] 1.0000000 0.2322704 0.1683649 0.1150505
## [2,] 0.2322704 1.0000000 0.2701451 0.1605367
## [3,] 0.1683649 0.2701451 1.0000000 0.2187383
## [4,] 0.1150505 0.1605367 0.2187383 1.0000000
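As a rough check on these sample correlations, one can compute the correlations implied by the simulation settings. The sketch below assumes the data were generated with the covariance matrix and resid_var = 2 used in the generator call above, so that each measurement has variance 1 + 2 = 3 and each off-diagonal correlation is the corresponding covariance divided by 3 (for example 0.7 / 3 ≈ 0.233, against a sample value of 0.232). The names `implied_cov` and `implied_cor` are introduced here for illustration only.

```r
# Random-effects covariance matrix used in the simulation
covar <- matrix(c(1, 0.7, 0.5, 0.3,
                  0.7, 1, 0.8, 0.5,
                  0.5, 0.8, 1, 0.6,
                  0.3, 0.5, 0.6, 1), nrow = 4)
resid_var <- 2

# Implied covariance of the measurements: random effects plus residual noise
implied_cov <- covar + diag(resid_var, 4)

# Implied time-to-time correlations, to compare with cor(broad)
implied_cor <- cov2cor(implied_cov)
round(implied_cor, 3)
```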
Fit the broken stick model with knots specified at ages 1:4.
library("brokenstick")
knots <- 1:3
boundary <- c(1, 4)
fit <- brokenstick(X1 ~ age | subject, data,
                   knots = knots, boundary = boundary,
                   method = "lmer")
## Warning: number of observations (=40000) <= number of random effects (=40000)
## for term (0 + age_1 + age_2 + age_3 + age_4 | subject); the random-effects
## parameters and the residual variance (or scale parameter) are probably
## unidentifiable
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0025362 (tol = 0.002, component 1)
omega <- fit$omega
beta <- fit$beta
sigma2 <- fit$sigma2
round(beta, 2)
## age_1 age_2 age_3 age_4
## -0.03 -0.03 0.02 0.01
round(sigma2, 4)
## [1] 1.3392
# correlation random effects
round(covar, 3)
## [,1] [,2] [,3] [,4]
## [1,] 1.0 0.7 0.5 0.3
## [2,] 0.7 1.0 0.8 0.5
## [3,] 0.5 0.8 1.0 0.6
## [4,] 0.3 0.5 0.6 1.0
round(omega, 2)
## age_1 age_2 age_3 age_4
## age_1 1.72 0.70 0.51 0.35
## age_2 0.70 1.62 0.81 0.48
## age_3 0.51 0.81 1.70 0.66
## age_4 0.35 0.48 0.66 1.65
# covariances measured data
round(omega + diag(sigma2, 4), 3)
## age_1 age_2 age_3 age_4
## age_1 3.057 0.699 0.513 0.348
## age_2 0.699 2.959 0.810 0.477
## age_3 0.513 0.810 3.040 0.659
## age_4 0.348 0.477 0.659 2.988
round(cov(broad), 3)
## [,1] [,2] [,3] [,4]
## [1,] 3.057 0.699 0.513 0.348
## [2,] 0.699 2.959 0.810 0.477
## [3,] 0.513 0.810 3.040 0.659
## [4,] 0.348 0.477 0.659 2.988
# convert to time-to-time correlation matrix
round(cov2cor(omega + diag(sigma2, 4)), 3)
## age_1 age_2 age_3 age_4
## age_1 1.000 0.232 0.168 0.115
## age_2 0.232 1.000 0.270 0.161
## age_3 0.168 0.270 1.000 0.219
## age_4 0.115 0.161 0.219 1.000
round(cor(broad), 3)
## [,1] [,2] [,3] [,4]
## [1,] 1.000 0.232 0.168 0.115
## [2,] 0.232 1.000 0.270 0.161
## [3,] 0.168 0.270 1.000 0.219
## [4,] 0.115 0.161 0.219 1.000
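Writing hatC for the implied covariance matrix omega + diag(sigma2, 4), hatC reproduces the sample covariance matrix of the measurements, and cov2cor(hatC)
reproduces the sample time-to-time correlation matrix.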
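The agreement between the model-implied and sample matrices can be motivated by a short derivation. It is a sketch that assumes the design matrix of the broken stick model reduces to the identity when every observation lies exactly on a knot, so that the vector \(y_i\) of the four measurements of subject \(i\) is the sum of the fixed effects \(\beta\), the random effects \(b_i\) and independent residuals \(\varepsilon_i\) (the symbols \(y_i\), \(b_i\), \(\varepsilon_i\) and \(I_4\) are notation introduced here):

```latex
\[
\begin{aligned}
y_i &= \beta + b_i + \varepsilon_i,
  \qquad b_i \sim N(0, \Omega),
  \quad \varepsilon_i \sim N(0, \sigma^2 I_4), \\
\operatorname{Cov}(y_i) &= \operatorname{Cov}(b_i) + \operatorname{Cov}(\varepsilon_i)
  = \Omega + \sigma^2 I_4 .
\end{aligned}
\]
```

Under this assumption the off-diagonal elements of the measurement covariance equal those of \(\Omega\), while each diagonal element only determines the sum of a random-effect variance and \(\sigma^2\). This is consistent with the lmer warning that the random-effects parameters and the residual variance are probably unidentifiable, and with the estimates above, where for example 1.72 + 1.34 ≈ 3.06 matches the sample variance at age 1.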