Heterogeneity and person-specific effects

Estimands: population, partial pooling, and no pooling

Mixed models estimate population-level fixed effects and variance components for the random effects. Person-specific quantities are usually summarized by conditional modes (empirical Bayes estimates, reported as BLUPs in lme4).

These are not the same as fitting a separate regression in each person (no pooling), which ild_person_model() implements for teaching and idiographic workflows. No-pooling estimates do not shrink and can be very unstable with few observations per person.
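The contrast between no pooling and partial pooling can be illustrated with a small base-R sketch for person-specific intercepts. It does not use tidyILD or lme4: the variance components come from simple method-of-moments estimators for a balanced design, which is an assumption made for illustration and not how lme4 fits the model. All object names (no_pool, partial_pool, lambda) are hypothetical.

```r
# Sketch: no pooling vs. partial pooling for person-specific intercepts.
# Simulated balanced data; all names and values here are illustrative.
set.seed(1)
n_id  <- 50   # persons
n_obs <- 5    # observations per person (deliberately few)
tau   <- 2    # true between-person SD of intercepts
sigma <- 2    # true residual SD

id <- rep(seq_len(n_id), each = n_obs)
u  <- rnorm(n_id, mean = 0, sd = tau)
y  <- 3 + u[id] + rnorm(n_id * n_obs, sd = sigma)

# No pooling: each person's own mean, estimated from that person's data only.
no_pool <- vapply(split(y, id), mean, numeric(1))

# Method-of-moments variance components (assumes a balanced design).
sigma2_hat <- mean(vapply(split(y, id), var, numeric(1)))
tau2_hat   <- max(var(no_pool) - sigma2_hat / n_obs, 0)

# Partial pooling: shrink each person mean toward the grand mean.
# lambda is the share of the variance of a person mean that is
# attributed to true between-person differences.
lambda       <- tau2_hat / (tau2_hat + sigma2_hat / n_obs)
grand        <- mean(y)
partial_pool <- grand + lambda * (no_pool - grand)

round(c(lambda = lambda,
        sd_no_pool = sd(no_pool),
        sd_partial_pool = sd(partial_pool)), 3)
```

The shrunken estimates have a smaller spread than the raw person means, by exactly the factor lambda. With more observations per person, lambda approaches 1 and the two sets of estimates converge.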

Bayesian fits from ild_brms() report person-specific summaries via posterior distributions; ild_heterogeneity() reads coef(fit, summary = TRUE) for posterior means and intervals.

ild_heterogeneity()

After fitting a model with random effects via ild_lme() or ild_brms(), call ild_heterogeneity() on the fit:

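library(tidyILD)
# Simulate 20 persons with 10 observations each
d <- ild_simulate(n_id = 20, n_obs_per = 10, seed = 7)
# Declare the person and time columns
x <- ild_prepare(d, id = "id", time = "time")
# Split y into within-person (y_wp) and between-person (y_bp) components
x <- ild_center(x, y)
# Random-intercept model
fit <- ild_lme(y ~ y_wp + y_bp + (1 | id), data = x)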
#> Temporal autocorrelation is not modeled (ar1 = FALSE). Consider ar1 = TRUE for ILD.
#> Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
#> Model failed to converge with max|grad| = 3.88101 (tol = 0.002, component 1)
#> Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
#>  - Rescale variables?
h <- ild_heterogeneity(fit)
print(h$summary)
#> # A tibble: 1 × 11
#>   group term        n_levels mean_total sd_total prop_gt_zero      q25      q50
#>   <chr> <chr>          <int>      <dbl>    <dbl>        <dbl>    <dbl>    <dbl>
#> 1 id    (Intercept)       20   1.25e-15 7.62e-19            1 1.25e-15 1.25e-15
#> # ℹ 3 more variables: q75 <dbl>, prop_gt_threshold <dbl>, varcorr_sdcor <dbl>
head(ild_tidy(h))
#> # A tibble: 6 × 9
#>   group level_id term        estimate_ranef estimate_total std_error conf_low
#>   <chr> <chr>    <chr>                <dbl>          <dbl>     <dbl>    <dbl>
#> 1 id    1        (Intercept)      -5.92e-18       1.25e-15  2.66e-17 1.20e-15
#> 2 id    2        (Intercept)      -3.95e-18       1.25e-15  2.66e-17 1.20e-15
#> 3 id    3        (Intercept)      -3.95e-18       1.25e-15  2.66e-17 1.20e-15
#> 4 id    4        (Intercept)      -3.95e-18       1.25e-15  2.66e-17 1.20e-15
#> 5 id    5        (Intercept)      -3.70e-18       1.25e-15  2.66e-17 1.20e-15
#> 6 id    6        (Intercept)      -4.20e-18       1.25e-15  2.66e-17 1.20e-15
#> # ℹ 2 more variables: conf_high <dbl>, estimand <chr>

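The $summary table reports the proportion of person-specific total coefficients greater than zero and their quantiles; for lmer fits it also joins the VarCorr standard deviations when names align. In this toy fit the outcome y is exactly y_wp + y_bp, so the model is degenerate: the convergence warnings and the near-zero intercepts (about 1e-15) are expected and carry no substantive meaning.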

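The optional arguments threshold and scale = c("raw", "sd_x", "sd_y") set a substantively motivated cutoff. The summary then reports the proportion of person-specific coefficients exceeding it, with the cutoff expressed on the raw scale or as a fraction of the SD of \(x\) or \(y\).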
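To make these summary columns concrete, here is a minimal base-R sketch that computes the same kinds of quantities from a vector of person-specific coefficients. The helper summarize_coefs() and its scale_sd argument are hypothetical and are not part of tidyILD. Two further assumptions: the comparison with the cutoff is signed rather than absolute, and an SD-based scale multiplies the threshold by the relevant SD. The package's actual rule may differ.

```r
# Hypothetical helper (not part of tidyILD): summarize person-specific
# coefficients in the spirit of the $summary columns described above.
summarize_coefs <- function(coefs, threshold = NULL, scale_sd = 1) {
  q <- unname(quantile(coefs, probs = c(0.25, 0.5, 0.75)))
  out <- c(
    mean_total   = mean(coefs),
    sd_total     = sd(coefs),
    prop_gt_zero = mean(coefs > 0),
    q25 = q[1], q50 = q[2], q75 = q[3]
  )
  if (!is.null(threshold)) {
    # Assumption: the cutoff is threshold * scale_sd and the test is signed.
    out["prop_gt_threshold"] <- mean(coefs > threshold * scale_sd)
  }
  out
}

# Illustrative person-specific slopes for four persons
coefs <- c(-0.2, 0.1, 0.3, 0.5)

# Raw scale: share of slopes above 0.25
res_raw <- summarize_coefs(coefs, threshold = 0.25)

# SD scale: cutoff is half an SD of x, with an assumed sd(x) of 0.4
res_sdx <- summarize_coefs(coefs, threshold = 0.5, scale_sd = 0.4)

res_raw
res_sdx
```

Stating the cutoff in SD units keeps it comparable across predictors measured on different scales.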

Diagnostics bundle and plots

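ild_diagnose() stores the extracted heterogeneity in the fit$heterogeneity slot of the bundle it returns, provided extraction succeeds. With bundle holding that result, plot it with: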

ild_autoplot(bundle, section = "fit", type = "heterogeneity", term = "y_wp")

To draw a histogram instead, additionally pass heterogeneity_type = "histogram"; the argument is forwarded through the ... argument.

The guardrails GR_RE_SLOPE_VARIANCE_VERSUS_RESIDUAL_LOW and GR_PERSON_SPECIFIC_SLOPES_EMPIRICALLY_TIGHT flag cases where the estimated slope heterogeneity is small relative to residual noise. They are heuristic aids to interpretation.

Stratified descriptive comparison

ild_heterogeneity_stratified() refits the same formula within levels of a grouping column and binds per-subgroup summaries. This is a descriptive tool, not a formal test of differences in variance components; use with adequate \(N\) per subgroup (min_n_id).

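# Assumes x contains a grouping column named "cohort"
ild_heterogeneity_stratified(
  y ~ y_wp + (y_wp | id),   # random slope for y_wp within person
  data = x,
  subgroup = "cohort",      # column defining the subgroups
  min_n_id = 8L             # minimum number of persons per subgroup
)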
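The refit-and-bind idea can be sketched in base R. This is not the package's implementation: it substitutes per-person lm() fits for the mixed model to stay dependency-free, and it assumes that subgroups with fewer than min_n_id persons are simply skipped, which the text above does not confirm. The data, the cohort sizes, and the helper summarize_subgroup() are all hypothetical.

```r
# Sketch: descriptive comparison of slope heterogeneity across subgroups.
# Uses per-person lm() fits as a stand-in for a mixed model.
set.seed(2)
n_id  <- 30
n_obs <- 12
dat <- data.frame(
  id = rep(seq_len(n_id), each = n_obs),
  x  = rnorm(n_id * n_obs)
)

# Hypothetical cohorts: A has 12 persons, B has 13, C has only 5
cohort_of_id <- rep(c("A", "B", "C"), times = c(12, 13, 5))
dat$cohort   <- cohort_of_id[dat$id]

# Person-specific true slopes plus residual noise
slope_of_id <- rnorm(n_id, mean = 0.5, sd = 0.3)
dat$y <- slope_of_id[dat$id] * dat$x + rnorm(nrow(dat))

min_n_id <- 8L

# Summarize the spread of per-person slopes within one subgroup
summarize_subgroup <- function(d) {
  slopes <- vapply(
    split(d, d$id),
    function(p) unname(coef(lm(y ~ x, data = p))["x"]),
    numeric(1)
  )
  data.frame(
    cohort     = d$cohort[1],
    n_id       = length(slopes),
    mean_slope = mean(slopes),
    sd_slope   = sd(slopes)
  )
}

# Keep only subgroups with enough persons, then bind the summaries
by_cohort <- split(dat, dat$cohort)
keep <- vapply(by_cohort,
               function(d) length(unique(d$id)) >= min_n_id,
               logical(1))
stratified <- do.call(rbind, lapply(by_cohort[keep], summarize_subgroup))
stratified
```

Cohort C is dropped because it has only 5 persons. The remaining rows can be compared descriptively, keeping in mind that differences in sd_slope partly reflect sampling noise.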

See also

vignette("developer-contracts", package = "tidyILD") documents the optional fit$heterogeneity slot on ild_diagnostics_bundle.