
7.1 The “standard” specification of the multilevel model for change

In Chapter 7 Singer and Willett (2003) examine the generalized least squares approach to modelling change using artificial data created by Willett (1988), who simulated changes in performance on a hypothetical “opposites naming” task over a four-week period in a sample of 35 people.

For this example we use the opposites_naming data set, a person-period data frame with 140 rows and 5 columns:

  • id: Participant ID.
  • wave: Wave of measurement.
  • time: Wave of measurement centred at time 0.
  • opposites_naming_score: Score on the “opposites naming” task.
  • baseline_cognitive_score: Baseline score on a standardized instrument assessing general cognitive skill.
opposites_naming
#> # A tibble: 140 × 5
#>    id     wave  time opposites_naming_score baseline_cognitive_score
#>    <fct> <dbl> <dbl>                  <dbl>                    <dbl>
#>  1 1         1     0                    205                      137
#>  2 1         2     1                    217                      137
#>  3 1         3     2                    268                      137
#>  4 1         4     3                    302                      137
#>  5 2         1     0                    219                      123
#>  6 2         2     1                    243                      123
#>  7 2         3     2                    279                      123
#>  8 2         4     3                    302                      123
#>  9 3         1     0                    142                      129
#> 10 3         2     1                    212                      129
#> # ℹ 130 more rows

As the person-level version of the opposites_naming data below shows, this is a time-structured data set with four measurements per participant, and a time-invariant predictor reflecting each participant’s cognitive skill at baseline.

opposites_naming_pl <- opposites_naming |>
  select(-time) |>
  pivot_wider(
    names_from = wave,
    values_from = opposites_naming_score,
    names_prefix = "opp_"
  ) |>
  relocate(baseline_cognitive_score, .after = everything())

# Table 7.1:
head(opposites_naming_pl, 10)
#> # A tibble: 10 × 6
#>    id    opp_1 opp_2 opp_3 opp_4 baseline_cognitive_score
#>    <fct> <dbl> <dbl> <dbl> <dbl>                    <dbl>
#>  1 1       205   217   268   302                      137
#>  2 2       219   243   279   302                      123
#>  3 3       142   212   250   289                      129
#>  4 4       206   230   248   273                      125
#>  5 5       190   220   229   220                       81
#>  6 6       165   205   207   263                      110
#>  7 7       170   182   214   268                       99
#>  8 8        96   131   159   213                      113
#>  9 9       138   156   197   200                      104
#> 10 10      216   252   274   298                       96

We begin by fitting a “standard” multilevel model for change to the opposites_naming data, which will serve as a point of comparison for alternative models with different error covariance structures. Our “standard” model for the opposites_naming data takes the familiar form:

$$
\begin{alignat}{2}
&\text{Level 1:} \\
&\quad \text{opposites_naming_score}_{ij} = \pi_{0i} + \pi_{1i} \text{time}_{ij} + \epsilon_{ij} \\
&\text{Level 2:} \\
&\quad \pi_{0i} = \gamma_{00} + \gamma_{01} (\text{baseline_cognitive_score}_i - 113.4571) + \zeta_{0i} \\
&\quad \pi_{1i} = \gamma_{10} + \gamma_{11} (\text{baseline_cognitive_score}_i - 113.4571) + \zeta_{1i},
\end{alignat}
$$

where

$$
\epsilon_{ij} \stackrel{iid}{\sim} \operatorname{Normal}(0, \sigma_\epsilon^2),
$$

and

$$
\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \stackrel{iid}{\sim} N \begin{pmatrix} \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{bmatrix} \end{pmatrix}.
$$

As Singer and Willett (2003) discuss, because the focus of this chapter is on the error covariance structure of the multilevel model for change, we fit this model using restricted maximum likelihood so that its goodness-of-fit statistics reflect only the stochastic portion of the model’s fit.

Additionally, we will use the lme() function from the nlme package instead of lme4’s lmer() function to fit this model, because the former has methods that make examining the fitted model’s error covariance structure easier. The API for the lme() function is similar to that of the lmer() function, except that the fixed and random effects are specified in separate formulas rather than a single formula.
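For comparison, here is a minimal sketch of the equivalent lmer() specification (assuming lme4 is installed and loaded); it fits the same model, with the random effects given inside the single model formula rather than in a separate argument:

library(lme4)

# Same "standard" model for change, in lme4 syntax:
lmer(
  opposites_naming_score ~ time * I(baseline_cognitive_score - 113.4571) +
    (time | id),
  data = opposites_naming,
  REML = TRUE
)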

# Fit model -------------------------------------------------------------------
# Note that 113.4571 is the sample mean of baseline_cognitive_score, so the
# time-invariant predictor is centred at its mean.
opposites_naming_fit_standard <- lme(
  opposites_naming_score ~ time * I(baseline_cognitive_score - 113.4571),
  random = ~ time | id,
  data = opposites_naming,
  method = "REML"
)

# Make table ------------------------------------------------------------------
options(modelsummary_get = "all")

# Table 7.2, page 246:
opposites_naming_fit_standard |>
  list() |>
  set_names("Estimate") |>
  modelsummary(
    fmt = 2,
    statistic = NULL,
    effects = c("var_model", "ran_pars", "fixed"),
    scales = c("vcov", "vcov", NA),
    coef_map = c(
      "(Intercept)",
      "I(baseline_cognitive_score - 113.4571)",
      "time",
      "time:I(baseline_cognitive_score - 113.4571)",
      "var_Observation",
      "var_(Intercept)",
      "var_time",
      "cov_time.(Intercept)"
    ),
    gof_map = list(
      list(
        raw = "logLik",
        clean = "Deviance",
        fmt = \(.x) vec_fmt_number(
          -2*as.numeric(.x), decimals = 1, sep_mark = ""
        )
      ),
      list(
        raw = "AIC",
        clean = "AIC",
        fmt = fmt_decimal(1)
      ),
      list(
        raw = "BIC",
        clean = "BIC",
        fmt = fmt_decimal(1)
      )
    ),
    output = "gt"
  ) |>
  tab_row_group(label = "Goodness-of-Fit", rows = 9:11) |>
  tab_row_group(label = "Variance Components", rows = 5:8) |>
  tab_row_group(label = "Fixed Effects", rows = 1:4)
                                              Estimate
Fixed Effects
(Intercept)                                     164.37
I(baseline_cognitive_score - 113.4571)           -0.11
time                                             26.96
time:I(baseline_cognitive_score - 113.4571)       0.43
Variance Components
var_Observation                                 159.48
var_(Intercept)                                1236.42
var_time                                        107.25
cov_time.(Intercept)                           -178.24
Goodness-of-Fit
Deviance                                        1260.3
AIC                                             1276.3
BIC                                             1299.6

7.2 Using the composite model to understand assumptions about the error covariance matrix

In Section 7.2 Singer and Willett (2003) examine the error covariance structure implied by the “standard” multilevel model for change, given its random effects specification. To do so, we begin by substituting the level-2 equations into the level-1 equation, yielding a composite representation of the “standard” model (abbreviating opposites_naming_score as opp and baseline_cognitive_score as cog):

$$
\text{opp}_{ij} = \gamma_{00} + \gamma_{10}\text{time}_{ij} + \gamma_{01}(\text{cog}_i - 113.4571) + \gamma_{11}\text{time}_{ij}(\text{cog}_i - 113.4571) + r_{ij},
$$

where the composite residual, $r_{ij}$, represents the weighted linear combination of the model’s original three random effects:

$$
r_{ij} = \epsilon_{ij} + \zeta_{0i} + \zeta_{1i} \text{time}_{ij}.
$$

Notice that the composite model now looks like a typical multiple regression model, with the usual error term, $\epsilon_i$, replaced by the composite residual, $r_{ij}$. Following from this observation, we can reexpress the distributional assumptions for all the residuals of the “standard” multilevel model for change in one grand statement based on the composite residual:

$$
r \sim N \begin{pmatrix} \mathbf{0}, \begin{bmatrix} \mathbf{\Sigma}_r & \mathbf{0} & \mathbf{0} & \dots & \mathbf{0} \\ \mathbf{0} & \mathbf{\Sigma}_r & \mathbf{0} & \dots & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{\Sigma}_r & \dots & \mathbf{0} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \dots & \mathbf{\Sigma}_r \end{bmatrix} \end{pmatrix},
$$

where $\mathbf{\Sigma}_r$ represents the error covariance sub-matrix repeated along the block diagonal, whose dimensions reflect the design of the opposites_naming data, given by:

$$
\mathbf{\Sigma}_r = \begin{bmatrix} \sigma_{r_1}^2 & \sigma_{r_1 r_2} & \sigma_{r_1 r_3} & \sigma_{r_1 r_4} \\ \sigma_{r_2 r_1} & \sigma_{r_2}^2 & \sigma_{r_2 r_3} & \sigma_{r_2 r_4} \\ \sigma_{r_3 r_1} & \sigma_{r_3 r_2} & \sigma_{r_3}^2 & \sigma_{r_3 r_4} \\ \sigma_{r_4 r_1} & \sigma_{r_4 r_2} & \sigma_{r_4 r_3} & \sigma_{r_4}^2 \end{bmatrix},
$$

with occasion-specific composite residual variances

$$
\begin{align}
\sigma_{r_j}^2 &= \operatorname{Var} \left( \epsilon_{ij} + \zeta_{0i} + \zeta_{1i} \text{time}_j \right) \\
&= \sigma_\epsilon^2 + \sigma_0^2 + 2 \sigma_{01} \text{time}_j + \sigma_1^2 \text{time}_j^2,
\end{align}
$$

and occasion-specific composite residual covariances

$$
\sigma_{r_j, r_{j'}} = \sigma_0^2 + \sigma_{01} (\text{time}_j + \text{time}_{j'}) + \sigma_1^2 \text{time}_j \text{time}_{j'},
$$

where all the terms have their usual meanings. We can retrieve the error covariance sub-matrix, $\mathbf{\Sigma}_r$, for the opposites_naming_fit_standard fit using the getVarCov() function from the nlme package. To emphasize that $\mathbf{\Sigma}_r$ is the same for each participant, we retrieve it here for both the first and last participants in the data.

opposites_naming_varcov_standard <- opposites_naming_fit_standard |>
  getVarCov(type = "marginal", individuals = c(1, 35))

opposites_naming_varcov_standard
#> id 1 
#> Marginal variance covariance matrix
#>         1       2       3       4
#> 1 1395.90 1058.20  879.95  701.71
#> 2 1058.20 1146.70  916.21  845.22
#> 3  879.95  916.21 1111.90  988.73
#> 4  701.71  845.22  988.73 1291.70
#>   Standard Deviations: 37.362 33.863 33.346 35.94 
#> id 35 
#> Marginal variance covariance matrix
#>         1       2       3       4
#> 1 1395.90 1058.20  879.95  701.71
#> 2 1058.20 1146.70  916.21  845.22
#> 3  879.95  916.21 1111.90  988.73
#> 4  701.71  845.22  988.73 1291.70
#>   Standard Deviations: 37.362 33.863 33.346 35.94

For descriptive purposes, we can also convert $\mathbf{\Sigma}_r$ into a correlation matrix using the cov2cor() function, to examine the residual autocorrelation between measurement occasions.

cov2cor(opposites_naming_varcov_standard[[1]])
#>           1         2         3         4
#> 1 1.0000000 0.8364012 0.7062988 0.5225751
#> 2 0.8364012 1.0000000 0.8113955 0.6944913
#> 3 0.7062988 0.8113955 1.0000000 0.8249967
#> 4 0.5225751 0.6944913 0.8249967 1.0000000
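
As a quick check (not shown in the text), we can reconstruct $\mathbf{\Sigma}_r$ by hand from the estimated variance components, applying the occasion-specific variance and covariance formulas above; the result should match the getVarCov() output up to rounding:

# Random effects covariance matrix and residual variance from the fit:
tau <- getVarCov(opposites_naming_fit_standard, type = "random.effects")
sigma_sq_0 <- tau[1, 1]
sigma_sq_1 <- tau[2, 2]
sigma_01 <- tau[1, 2]
sigma_sq_eps <- opposites_naming_fit_standard$sigma^2

# Composite residual covariances for every pair of occasions, plus the
# measurement error variance added along the diagonal:
time <- 0:3
outer(
  time, time,
  \(t1, t2) sigma_sq_0 + sigma_01 * (t1 + t2) + sigma_sq_1 * t1 * t2
) +
  diag(sigma_sq_eps, nrow = 4)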

As Singer and Willett (2003) discuss, examining the equations and outputs above reveals two important properties about the occasion-specific residuals of the “standard” multilevel model for change:

  • They can be both heteroscedastic and autocorrelated within participants (but remember that across participants, they are identically heteroscedastic and autocorrelated under the homogeneity assumption).
  • They have a powerful dependence on time. Specifically, the residual variances, $\sigma_{r_j}^2$, have a quadratic dependence on time that reaches its minimum at $\text{time} = -(\sigma_{01} / \sigma_1^2)$ and increases parabolically and symmetrically over time on either side of this minimum; and the residual covariances, $\sigma_{r_j, r_{j'}}$, have an (imperfect) band diagonal structure wherein the overall magnitude of the residual covariances tends to decline in diagonal “bands” out from the main diagonal. (The short computation after this list locates the minimum for our fitted model.)
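
To make the dependence on time concrete, we can plug the Table 7.2 estimates into $-(\sigma_{01} / \sigma_1^2)$; this is a small illustration using the output above, not a computation from the text:

# Minimum of the composite residual variance, using the estimated
# covariance (-178.24) and slope variance (107.25) from Table 7.2:
-(-178.24 / 107.25)
# About 1.66, i.e. between the second and third waves of measurement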

The first of these properties—allowing for heteroscedasticity and autocorrelation among the composite residuals—is a necessity given the anticipated demands of longitudinal data. Not all longitudinal data sets will be heteroscedastic or autocorrelated, but a credible model for change should allow for potential heteroscedasticity and autocorrelation.

One advantage of the “standard” multilevel model for change is that, although its composite residuals have a powerful dependence on time, it is capable of adapting itself relatively smoothly to many common empirical situations, automatically accommodating certain kinds of complex error structure. Nonetheless, Singer and Willett (2003) conclude by questioning whether the hypothesized structure of the error covariance matrix implied by the “standard” model can be applied ubiquitously, or whether there may be empirical situations where directly modelling alternative error covariance structures is preferable.

7.3 Postulating an alternative error covariance structure

In Section 7.3 Singer and Willett (2003) discuss how alternative error covariance structures can be modelled directly using an extended linear model for change with heteroscedastic, correlated errors, fitted by generalized least squares regression. See Chapter 5 of Pinheiro and Bates (2010) for discussion of the extended linear model.

We can fit the extended linear model for change with the gls() function from the nlme package, which allows us to model the within-group heteroscedasticity and correlation structures via the weights and correlation arguments, respectively.

Here we will fit six models to the opposites_naming data with the following error covariance structures: unstructured, compound symmetric, heterogeneous compound symmetric, autoregressive, heterogeneous autoregressive, and Toeplitz. Notice that unlike the multilevel model for change, the extended linear model for change has no random effects.

Equation Code

We need these equations for the table below—but there’s nothing interesting to look at here—hence why this section is collapsed.

hypothesized_varcov <- list(
  unstructured = r"($$\mathbf{\Sigma}_r = \begin{bmatrix}\sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 & \sigma_{34} \\ \sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma_4^2 \end{bmatrix}$$)",
  compsymm = r"($$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma^2 + \sigma_1^2 & \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma^2 + \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma^2 + \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 & \sigma^2 + \sigma_1^2 \end{bmatrix}$$)",
  hetcompsymm = r"($$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma_1^2 & \sigma_1 \sigma_2 \rho & \sigma_1 \sigma_3 \rho & \sigma_1 \sigma_4 \rho \\ \sigma_2 \sigma_1 \rho & \sigma_2^2 & \sigma_2 \sigma_3 \rho & \sigma_2 \sigma_4 \rho \\ \sigma_3 \sigma_1 \rho & \sigma_3 \sigma_2 \rho & \sigma_3^2 & \sigma_3 \sigma_4 \rho \\ \sigma_4 \sigma_1 \rho & \sigma_4 \sigma_2 \rho & \sigma_4 \sigma_3 \rho & \sigma_4^2 \end{bmatrix}$$)",
  ar = r"($$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma^2 & \sigma^2 \rho & \sigma^2 \rho^2 & \sigma^2 \rho^3 \\ \sigma^2 \rho & \sigma^2 & \sigma^2 \rho & \sigma^2 \rho^2 \\ \sigma^2 \rho^2 & \sigma^2 \rho & \sigma^2 & \sigma^2 \rho \\ \sigma^2 \rho^3 & \sigma^2 \rho^2 & \sigma^2 \rho & \sigma^2 \end{bmatrix}$$)",
  hetar = r"($$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma_1^2 & \sigma_1 \sigma_2 \rho & \sigma_1 \sigma_3 \rho^2 & \sigma_1 \sigma_4 \rho^3 \\ \sigma_2 \sigma_1 \rho & \sigma_2^2 & \sigma_2 \sigma_3 \rho & \sigma_2 \sigma_4 \rho^2 \\ \sigma_3 \sigma_1 \rho^2 & \sigma_3 \sigma_2 \rho & \sigma_3^2 & \sigma_3 \sigma_4 \rho \\ \sigma_4 \sigma_1 \rho^3 & \sigma_4 \sigma_2 \rho^2 & \sigma_4 \sigma_3 \rho & \sigma_4^2 \end{bmatrix}$$)",
  toeplitz = r"($$\mathbf{\Sigma}_r = \begin{bmatrix}\sigma^2 & \sigma_1 & \sigma_2 & \sigma_3 \\ \sigma_1 & \sigma^2 & \sigma_1 & \sigma_2 \\ \sigma_2 & \sigma_1 & \sigma^2 & \sigma_1 \\ \sigma_3 & \sigma_2 & \sigma_1 & \sigma^2 \end{bmatrix}$$)"
)
# Fit models ------------------------------------------------------------------

# Start with a base model we can update with the alternative error covariance
# structures. Note that we won't display this model in the table.
opposites_naming_fit <- gls(
  opposites_naming_score ~ time * I(baseline_cognitive_score - 113.4571),
  method = "REML",
  data = opposites_naming
)

# Unstructured:
opposites_naming_fit_unstructured <- update(
  opposites_naming_fit,
  correlation = corSymm(form = ~ 1 | id),
  weights = varIdent(form = ~ 1 | wave)
)

# Compound symmetry:
opposites_naming_fit_compsymm <- update(
  opposites_naming_fit,
  correlation = corCompSymm(form = ~ 1 | id)
)

# Heterogeneous compound symmetry:
opposites_naming_fit_hetcompsymm <- update(
  opposites_naming_fit_compsymm,
  weights = varIdent(form = ~ 1 | wave)
)

# Autoregressive:
opposites_naming_fit_ar <- update(
  opposites_naming_fit,
  correlation = corAR1(form = ~ 1 | id)
)

# Heterogeneous autoregressive:
opposites_naming_fit_hetar <- update(
  opposites_naming_fit_ar,
  weights = varIdent(form = ~ 1 | wave)
)

# Toeplitz:
opposites_naming_fit_toeplitz <- update(
  opposites_naming_fit,
  correlation = corARMA(form = ~ 1 | id, p = 3, q = 0)
)

opposites_naming_fits <- list(
  "Unstructured" = opposites_naming_fit_unstructured,
  "Compound symmetry" = opposites_naming_fit_compsymm,
  "Heterogeneous compound symmetry" = opposites_naming_fit_hetcompsymm,
  "Autoregressive" = opposites_naming_fit_ar,
  "Heterogeneous autoregressive" = opposites_naming_fit_hetar,
  "Toeplitz" = opposites_naming_fit_toeplitz
)
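
As an aside (not from the text), note that corARMA(p = 3, q = 0) parameterizes the Toeplitz-like structure as an AR(3) correlation; the three estimated coefficients can be inspected directly from the fitted object:

# Correlation structure stored in the "Toeplitz" fit:
opposites_naming_fit_toeplitz$modelStruct$corStruct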

# Make table ------------------------------------------------------------------

# Table 7.3, page 258-259:
opposites_naming_fits |>
  map2(
    # Note that this list was made in the collapsed code chunk above. It just
    # contains the equations corresponding to each error covariance structure.
    hypothesized_varcov, 
    \(.fit, .hypothesized_varcov) {
      format_varcov <- function(x) {
        x <- round(getVarCov(x), digits = 1)
        begin <- "$$\\begin{bmatrix}"
        body <- apply(x, 1, \(.x) paste0(paste(.x, collapse = "&"), "\\\\"))
        end <- "\\end{bmatrix}$$"
        paste0(c(begin, body, end), collapse = "")
      }
      
      gof <- .fit |>
        glance() |>
        mutate(
          hypothesized_varcov = .hypothesized_varcov,
          "-2LL" = as.numeric(-2 * logLik),
          varcov = format_varcov(.fit),
          across(where(is.numeric), \(.x) round(.x, digits = 1))
        ) |>
        select(hypothesized_varcov, "-2LL", AIC, BIC, varcov)
    }
  ) |>
  list_rbind(names_to = "structure") |>
  gt() |>
  # Note: Math formatting in HTML currently requires gt version 0.10.1.9000
  # (development version).
  fmt_markdown(columns = c(hypothesized_varcov, varcov))
structure: Unstructured
-2LL: 1255.8; AIC: 1283.8; BIC: 1324.6
hypothesized_varcov:
$$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 & \sigma_{34} \\ \sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma_4^2 \end{bmatrix}$$
varcov (estimated):
$$\begin{bmatrix} 1345.1 & 1005.8 & 946.2 & 583.2 \\ 1005.8 & 1150.5 & 1028.5 & 846.6 \\ 946.2 & 1028.5 & 1235.8 & 969.3 \\ 583.2 & 846.6 & 969.3 & 1206 \end{bmatrix}$$

structure: Compound symmetry
-2LL: 1287.0; AIC: 1299.0; BIC: 1316.5
hypothesized_varcov:
$$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma^2 + \sigma_1^2 & \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma^2 + \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma^2 + \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2 & \sigma^2 + \sigma_1^2 \end{bmatrix}$$
varcov (estimated):
$$\begin{bmatrix} 1231.4 & 900.1 & 900.1 & 900.1 \\ 900.1 & 1231.4 & 900.1 & 900.1 \\ 900.1 & 900.1 & 1231.4 & 900.1 \\ 900.1 & 900.1 & 900.1 & 1231.4 \end{bmatrix}$$

structure: Heterogeneous compound symmetry
-2LL: 1285.0; AIC: 1303.0; BIC: 1329.2
hypothesized_varcov:
$$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma_1^2 & \sigma_1 \sigma_2 \rho & \sigma_1 \sigma_3 \rho & \sigma_1 \sigma_4 \rho \\ \sigma_2 \sigma_1 \rho & \sigma_2^2 & \sigma_2 \sigma_3 \rho & \sigma_2 \sigma_4 \rho \\ \sigma_3 \sigma_1 \rho & \sigma_3 \sigma_2 \rho & \sigma_3^2 & \sigma_3 \sigma_4 \rho \\ \sigma_4 \sigma_1 \rho & \sigma_4 \sigma_2 \rho & \sigma_4 \sigma_3 \rho & \sigma_4^2 \end{bmatrix}$$
varcov (estimated):
$$\begin{bmatrix} 1438.1 & 912.9 & 946.6 & 1009.5 \\ 912.9 & 1067.8 & 815.7 & 869.9 \\ 946.6 & 815.7 & 1148 & 902 \\ 1009.5 & 869.9 & 902 & 1305.7 \end{bmatrix}$$

structure: Autoregressive
-2LL: 1265.9; AIC: 1277.9; BIC: 1295.4
hypothesized_varcov:
$$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma^2 & \sigma^2 \rho & \sigma^2 \rho^2 & \sigma^2 \rho^3 \\ \sigma^2 \rho & \sigma^2 & \sigma^2 \rho & \sigma^2 \rho^2 \\ \sigma^2 \rho^2 & \sigma^2 \rho & \sigma^2 & \sigma^2 \rho \\ \sigma^2 \rho^3 & \sigma^2 \rho^2 & \sigma^2 \rho & \sigma^2 \end{bmatrix}$$
varcov (estimated):
$$\begin{bmatrix} 1256.7 & 1037.2 & 856 & 706.5 \\ 1037.2 & 1256.7 & 1037.2 & 856 \\ 856 & 1037.2 & 1256.7 & 1037.2 \\ 706.5 & 856 & 1037.2 & 1256.7 \end{bmatrix}$$

structure: Heterogeneous autoregressive
-2LL: 1264.8; AIC: 1282.8; BIC: 1309.1
hypothesized_varcov:
$$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma_1^2 & \sigma_1 \sigma_2 \rho & \sigma_1 \sigma_3 \rho^2 & \sigma_1 \sigma_4 \rho^3 \\ \sigma_2 \sigma_1 \rho & \sigma_2^2 & \sigma_2 \sigma_3 \rho & \sigma_2 \sigma_4 \rho^2 \\ \sigma_3 \sigma_1 \rho^2 & \sigma_3 \sigma_2 \rho & \sigma_3^2 & \sigma_3 \sigma_4 \rho \\ \sigma_4 \sigma_1 \rho^3 & \sigma_4 \sigma_2 \rho^2 & \sigma_4 \sigma_3 \rho & \sigma_4^2 \end{bmatrix}$$
varcov (estimated):
$$\begin{bmatrix} 1340.7 & 1000.7 & 857.3 & 708.9 \\ 1000.7 & 1111.2 & 952 & 787.1 \\ 857.3 & 952 & 1213.3 & 1003.2 \\ 708.9 & 787.1 & 1003.2 & 1234 \end{bmatrix}$$

structure: Toeplitz
-2LL: 1258.1; AIC: 1274.1; BIC: 1297.4
hypothesized_varcov:
$$\mathbf{\Sigma}_r = \begin{bmatrix} \sigma^2 & \sigma_1 & \sigma_2 & \sigma_3 \\ \sigma_1 & \sigma^2 & \sigma_1 & \sigma_2 \\ \sigma_2 & \sigma_1 & \sigma^2 & \sigma_1 \\ \sigma_3 & \sigma_2 & \sigma_1 & \sigma^2 \end{bmatrix}$$
varcov (estimated):
$$\begin{bmatrix} 1246.9 & 1029.3 & 896.6 & 624 \\ 1029.3 & 1246.9 & 1029.3 & 896.6 \\ 896.6 & 1029.3 & 1246.9 & 1029.3 \\ 624 & 896.6 & 1029.3 & 1246.9 \end{bmatrix}$$

Comparing the deviance (-2LL), AIC, and BIC statistics for the alternative error covariance structures, we find that the unstructured and Toeplitz structures lead to the best-fitting models for the opposites_naming data.
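
If we want those comparisons tabulated directly, nlme’s anova() method collects the AIC, BIC, and log-likelihood for several fits at once (a convenience not shown in the text; the likelihood-ratio test it also prints is only meaningful for nested models):

anova(
  opposites_naming_fit_toeplitz,
  opposites_naming_fit_unstructured
)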

Finally, we can see what is gained and lost when modelling the error covariance structure directly (instead of indirectly through random effects) by comparing fixed effect estimates and goodness-of-fit statistics from the unstructured and Toeplitz models with those of the “standard” multilevel model for change.

# Table 7.4, page 265:
opposites_naming_fit_standard |>
  list() |>
  set_names("Standard") |>
  c(keep_at(opposites_naming_fits, c("Toeplitz", "Unstructured"))) |>
  (\(.x) .x[c("Standard", "Toeplitz", "Unstructured")])() |>
  modelsummary(
    fmt = fmt_statistic(estimate = 2, statistic = 3),
    gof_map = list(
      list(
        raw = "logLik",
        clean = "Deviance",
        fmt = \(.x) vec_fmt_number(
          -2*as.numeric(.x), decimals = 1, sep_mark = ""
        )
      ),
      list(
        raw = "AIC",
        clean = "AIC",
        fmt = fmt_decimal(1)
      ),
      list(
        raw = "BIC",
        clean = "BIC",
        fmt = fmt_decimal(1)
      )
    ),
    output = "gt"
  ) |>
  tab_row_group(label = "Goodness-of-Fit", rows = 13:15) |>
  tab_row_group(label = "Variance Components", rows = 9:12) |>
  tab_row_group(label = "Fixed Effects", rows = 1:8)
                                                Standard   Toeplitz   Unstructured
Fixed Effects
(Intercept)                                       164.37     165.10         165.83
                                                 (6.206)    (5.922)        (5.953)
time                                               26.96      26.90          26.58
                                                 (1.994)    (1.943)        (1.926)
I(baseline_cognitive_score - 113.4571)             -0.11       0.00          -0.07
                                                 (0.504)    (0.481)        (0.483)
time × I(baseline_cognitive_score - 113.4571)       0.43       0.44           0.46
                                                 (0.162)    (0.158)        (0.156)
Variance Components
sd_(Intercept)                                     35.16
cor_time.(Intercept)                               -0.49
sd_time                                            10.36
sd_Observation                                     12.63
Goodness-of-Fit
Deviance                                          1260.3     1258.1         1255.8
AIC                                               1276.3     1274.1         1283.8
BIC                                               1299.6     1297.4         1324.6

Singer and Willett (2003) observe that for the opposites_naming data:

  • The Toeplitz model fits slightly better than the “standard” model on all counts, but not enough to reject the “standard” model.
  • The unstructured model fits best if we focus exclusively on the deviance statistic, but at the cost of losing more degrees of freedom than any other error covariance structure considered here.
  • The fixed effects estimates are similar between the “standard”, Toeplitz, and unstructured models (except for I(baseline_cognitive_score - 113.4571)), but the precision of these estimates is slightly better for the Toeplitz and unstructured models, which better represent the error covariance structure of the data.

Thus, for these data, we conclude that not much is gained by replacing the “standard” multilevel model for change with any of the extended linear models for change explored here.

However, in other data sets the magnitude of the difference between these modelling approaches may be greater (depending on the study design, statistical model, choice of error covariance structure, and the nature of the phenomenon under study), which may lead us to prefer the extended linear model for change, if our inferential goal exclusively involves population-averaged interpretations of fixed effects and we are not at all interested in addressing questions about individuals via random effects (for discussion, see McNeish, Stapleton, & Silverman, 2017; Muff, Held, & Keller, 2016).