# Check perfect model

## Objective

In general, the broken stick model smooths the observed growth trajectory. What happens if all observations are already aligned to the break ages? Does the model perfectly represent the data? Is the covariance matrix of the random effects ($$\Omega$$) equal to the covariance between the measurements? Is $$\sigma^2$$ equal to zero?
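These questions can be made concrete with a short derivation. The sketch below assumes the usual linear mixed model form of the broken stick model; the symbols $$X_i$$ (linear B-spline basis evaluated at the ages of subject $$i$$), $$\beta$$, $$b_i$$ and $$\epsilon_i$$ are notation assumed here, not taken from this section. When every observation falls exactly on a break age, $$X_i$$ reduces to the identity matrix, so the covariance of the measurements is the random-effect covariance plus the residual variance on the diagonal.

```latex
y_i = X_i(\beta + b_i) + \epsilon_i, \qquad
b_i \sim N(0, \Omega), \qquad
\epsilon_i \sim N(0, \sigma^2 I_{n_i})

X_i = I_{n_i} \;\Longrightarrow\;
\operatorname{Cov}(y_i) = \Omega + \sigma^2 I_{n_i}
```

Under this form, $$\Omega$$ equals the covariance between the measurements only if $$\sigma^2 = 0$$.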

## Data generation

We adapt code from http://www.davekleinschmidt.com/sst-mixed-effects-simulation/simulations_slides.pdf to generate test data:

library("plyr")
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
##
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
##
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
library("mvtnorm")
make_data_generator <- function(resid_var = 1,
                                ranef_covar = diag(c(1, 1)), n = 100
                                ) {
  ni <- nrow(ranef_covar)
  generate_data <- function() {
    # sample data set under mixed effects model with random slope/intercepts
    simulated_data <- rdply(n, {
      b <- t(rmvnorm(n = 1, sigma = ranef_covar))
      epsilon <- rnorm(n = length(b), mean = 0, sd = sqrt(resid_var))
      b + epsilon
    })
    data.frame(
      subject = rep(1:n, each = ni),
      age = rep(1:ni, n),
      simulated_data)
  }
}

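Let us first model the situation where the ages align perfectly with the break ages. In the perfect situation $$\sigma^2 = 0$$, which corresponds to setting resid_var to zero; the run shown below uses resid_var = 2, so the true $$\sigma^2$$ is 2.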

set.seed(77711)
covar <- matrix(c(1, 0.7, 0.5, 0.3,
                  0.7, 1, 0.8, 0.5,
                  0.5, 0.8, 1, 0.6,
                  0.3, 0.5, 0.6, 1), nrow = 4)
gen_dat <- make_data_generator(n = 10000,
                               ranef_covar = covar,
                               resid_var = 2)
data <- gen_dat()
head(data)
##   subject age .n     X1
## 1       1   1  1 -0.958
## 2       1   2  1 -2.281
## 3       1   3  1 -2.713
## 4       1   4  1 -2.677
## 5       2   1  2  0.017
## 6       2   2  2 -1.517
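The generator adds independent noise with variance resid_var to random effects with covariance ranef_covar, so the population covariance of the simulated $$y$$'s should be their sum. The following is a minimal sketch in base R that computes these target values for the settings used in the call above; the names implied_cov and implied_cor are introduced here for illustration only.

```r
# covariance of the random effects, as passed to make_data_generator()
covar <- matrix(c(1, 0.7, 0.5, 0.3,
                  0.7, 1, 0.8, 0.5,
                  0.5, 0.8, 1, 0.6,
                  0.3, 0.5, 0.6, 1), nrow = 4)
resid_var <- 2

# population covariance of y: random-effect covariance plus independent noise
implied_cov <- covar + diag(resid_var, nrow(covar))

# population time-to-time correlations (hypothetical name, illustration only)
implied_cor <- cov2cor(implied_cov)
round(implied_cor, 2)
```

Because every variance equals 1 + 2 = 3, each implied correlation is the corresponding element of covar divided by 3, for example 0.7 / 3 for ages 1 and 2.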

Check the correlation matrix of the $$y$$’s.

library("tidyr")
library("dplyr")
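d <- as_tibble(data[,-3])
# reshape to wide format: one row per subject, one column per age
broad <- t(spread(d, subject, X1))[-1,]
cor(broad)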
##      [,1] [,2] [,3] [,4]
## [1,] 1.00 0.23 0.17 0.12
## [2,] 0.23 1.00 0.27 0.16
## [3,] 0.17 0.27 1.00 0.22
## [4,] 0.12 0.16 0.22 1.00
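Computing time-to-time correlations requires the data in wide format, with one row per subject and one column per age. As a hedged illustration of that reshaping step, the sketch below uses base R only on a small made-up data set; the object names toy and wide and the values in X1 are hypothetical and not part of the analysis above.

```r
# toy long-format data: 3 hypothetical subjects measured at ages 1 to 4
toy <- data.frame(
  subject = rep(1:3, each = 4),
  age = rep(1:4, times = 3),
  X1 = c(0.1, 0.4, 0.9, 1.2,
         -0.3, 0.0, 0.2, 0.8,
         0.5, 0.3, 0.6, 0.4)
)

# sort so that each consecutive run of 4 values belongs to one subject
toy <- toy[order(toy$subject, toy$age), ]

# fill row by row: one row per subject, one column per age
wide <- matrix(toy$X1, ncol = 4, byrow = TRUE,
               dimnames = list(NULL, paste0("age_", 1:4)))

# 4 x 4 matrix of time-to-time correlations
cor(wide)
```

This approach relies on every subject having exactly one observation at each age, which holds for the simulated data in this section.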

## Fit model

Fit the broken stick model with break ages 1 to 4, specified as knots 1:3 plus the boundary c(1, 4).

library("brokenstick")
knots <- 1:3
boundary <- c(1, 4)
fit <- brokenstick(X1 ~ age | subject, data,
                   knots = knots, boundary = boundary,
                   method = "lmer")
## Warning: number of observations (=40000) <= number of random effects (=40000)
## for term (0 + age_1 + age_2 + age_3 + age_4 | subject); the random-effects
## parameters and the residual variance (or scale parameter) are probably
## unidentifiable
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0025362 (tol = 0.002, component 1)
omega <- fit$omega
beta <- fit$beta
sigma2 <- fit$sigma2
round(beta, 2)
## age_1 age_2 age_3 age_4
## -0.03 -0.03  0.02  0.01
round(sigma2, 4)
## [1] 1.3
# true covariance of the random effects (unit variances, so also the correlations)
round(covar, 3)
##      [,1] [,2] [,3] [,4]
## [1,]  1.0  0.7  0.5  0.3
## [2,]  0.7  1.0  0.8  0.5
## [3,]  0.5  0.8  1.0  0.6
## [4,]  0.3  0.5  0.6  1.0
round(omega, 2)
##       age_1 age_2 age_3 age_4
## age_1  1.72  0.70  0.51  0.35
## age_2  0.70  1.62  0.81  0.48
## age_3  0.51  0.81  1.70  0.66
## age_4  0.35  0.48  0.66  1.65
# implied covariances of the measured data: omega plus sigma2 on the diagonal
round(omega + diag(sigma2, 4), 3)
##       age_1 age_2 age_3 age_4
## age_1  3.06  0.70  0.51  0.35
## age_2  0.70  2.96  0.81  0.48
## age_3  0.51  0.81  3.04  0.66
## age_4  0.35  0.48  0.66  2.99
round(cov(broad), 3)
##      [,1] [,2] [,3] [,4]
## [1,] 3.06 0.70 0.51 0.35
## [2,] 0.70 2.96 0.81 0.48
## [3,] 0.51 0.81 3.04 0.66
## [4,] 0.35 0.48 0.66 2.99
# convert to time-to-time correlation matrix
round(cov2cor(omega + diag(sigma2, 4)), 3)
##       age_1 age_2 age_3 age_4
## age_1  1.00  0.23  0.17  0.12
## age_2  0.23  1.00  0.27  0.16
## age_3  0.17  0.27  1.00  0.22
## age_4  0.12  0.16  0.22  1.00
round(cor(broad), 3)
##      [,1] [,2] [,3] [,4]
## [1,] 1.00 0.23 0.17 0.12
## [2,] 0.23 1.00 0.27 0.16
## [3,] 0.17 0.27 1.00 0.22
## [4,] 0.12 0.16 0.22 1.00
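The lmer warning above says that the random-effects parameters and the residual variance are probably unidentifiable. A minimal sketch of why: with one observation per break age, only the sum of the diagonal of $$\Omega$$ and $$\sigma^2$$ enters the covariance of the data, so different splits give the same implied covariance. The second split below (0.7 moved from the residual variance to the diagonal of $$\Omega$$) is a hypothetical illustration chosen to resemble the fitted values, not output of the model.

```r
covar <- matrix(c(1, 0.7, 0.5, 0.3,
                  0.7, 1, 0.8, 0.5,
                  0.5, 0.8, 1, 0.6,
                  0.3, 0.5, 0.6, 1), nrow = 4)

# split A: the values used to generate the data
omega_a <- covar
sigma2_a <- 2

# split B (hypothetical): shift 0.7 from the residual variance to diag(omega)
omega_b <- covar + diag(0.7, 4)
sigma2_b <- 1.3

implied_a <- omega_a + diag(sigma2_a, 4)
implied_b <- omega_b + diag(sigma2_b, 4)

# both splits imply the same covariance matrix for the measurements
isTRUE(all.equal(implied_a, implied_b))
```

This is consistent with the output above: the fitted diagonal of omega is about 1.7 and sigma2 about 1.3, whereas the generating values are 1 and 2, yet both sums are close to 3.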

## Conclusions

1. If $$\sigma^2 = 0$$ (resid_var = 0, a run not shown above), then the off-diagonal elements of $$\Omega$$ reproduce the correlations among the $$y$$’s. The estimate of $$\sigma^2$$ is too high (about 0.13 instead of 0).
2. If $$\sigma^2 > 0$$, then $$\hat C = \hat\Omega + \hat\sigma^2 I_{n_i}$$ reproduces the sample covariance matrix of the $$y$$’s to the printed precision.
3. Applying cov2cor() to $$\hat C$$ reproduces the sample time-to-time correlation matrix.
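The population counterpart of these conclusions can be checked without fitting any model. The sketch below is an assumption-laden illustration in base R: it simulates random effects through a Cholesky factor instead of mvtnorm, uses an arbitrary seed and sample size, and introduces the hypothetical helper simulate_y(). It compares the sample covariance with $$\Omega + \sigma^2 I$$ for $$\sigma^2 = 0$$ and $$\sigma^2 = 2$$.

```r
set.seed(2024)  # arbitrary seed, not the one used in the analysis above
omega_true <- matrix(c(1, 0.7, 0.5, 0.3,
                       0.7, 1, 0.8, 0.5,
                       0.5, 0.8, 1, 0.6,
                       0.3, 0.5, 0.6, 1), nrow = 4)
n <- 20000

# hypothetical helper: rows are subjects, columns are the four break ages
simulate_y <- function(sigma2) {
  # chol() returns R with t(R) %*% R = omega_true, so rows of b have covariance omega_true
  b <- matrix(rnorm(n * 4), nrow = n) %*% chol(omega_true)
  eps <- matrix(rnorm(n * 4, sd = sqrt(sigma2)), nrow = n)
  b + eps
}

y0 <- simulate_y(sigma2 = 0)  # no measurement error
y2 <- simulate_y(sigma2 = 2)  # measurement error variance of 2

# largest absolute deviation from the implied covariance in each case
dev0 <- max(abs(cov(y0) - omega_true))
dev2 <- max(abs(cov(y2) - (omega_true + diag(2, 4))))
c(dev0 = dev0, dev2 = dev2)
```

Both deviations should be small relative to the matrix entries and shrink as n grows; with $$\sigma^2 = 0$$ the sample covariance estimates $$\Omega$$ itself.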
