This vignette embeds the developer contract shipped
with the package. Use it when adding engines (KFAS, ctsem, …) or
evolving ild_tidy() / ild_augment() /
ild_diagnose(). For a code skeleton (S3
stubs, bundle assembly, guardrails, provenance, tests), see
inst/dev/backend-adapter-template.R in the source tree.
Status: Normative for new backends and API
evolution. User-facing summaries live in
?ild_diagnostics_bundle, ?ild_tidy_schema, and
?ild_augment_schema; this document is the
operational specification for implementers.
Version: Align with tidyILD package
version; schema constants live in
R/ild_diagnostics_bundle.R,
R/ild_schema_tidy_augment.R, and
R/ild_guardrail_registry.R.
ild_diagnostics_bundle)c("ild_diagnostics_bundle", "list").meta, data, design,
fit, residual, predictive,
missingness, causal, warnings,
guardrails, summary_textILD_DIAGNOSTICS_BUNDLE_SLOTS in source).validate_ild_diagnostics_bundle() — list names must match
exactly; warnings and guardrails must be
tibbles; guardrails must include the columns in §1.10;
summary_text must be character.meta …
causal may be NULL if not applicable for an
engine or run. warnings and
guardrails are never NULL
(use empty tibbles).meta| Field | Required | Type | Semantics |
|---|---|---|---|
engine |
Recommended | character | Backend id: "lmer", "lme",
"brms", or future ids (e.g. "kfas"). |
n_obs, n_id |
Recommended | integer | Rows and distinct persons on the ILD used for diagnostics. |
ar1 |
Optional | logical | Whether residual AR1/CAR1 was used (frequentist path). |
brms_types |
Optional | character | Subset of diagnostics requested for brms (convergence,
sampler, ppc). |
Additional named entries are allowed if documented per engine.
dataPurpose: Observation-level and cohort-level data quality (spacing, gaps, global missingness pattern).
| Field | Required | Semantics |
|---|---|---|
summary |
Recommended | One-row tibble from ild_summary()$summary
(e.g. n_id, n_obs, prop_gap,
median_dt_sec, iqr_dt_sec). |
spacing_class |
Recommended | "regular-ish" or "irregular-ish"
(ild_spacing_class()). |
n_gaps, pct_gap |
Recommended | Gap counts / percent of intervals flagged as gaps. |
median_dt_sec, iqr_dt_sec |
Optional | Timing dispersion. |
time_range |
Optional | Numeric range of .ild_time_num or equivalent. |
cohort |
Recommended | list(n_id, n_obs, n_intervals) aligned with
ild_summary. |
obs_per_id |
Recommended | list(min, median, max, sd) observation counts per
person (occasion imbalance). |
outcome_summaries |
Optional | Tibble per outcome column: mean, sd,
min, max, pct_na, n
(when model response/predictors passed to
ild_diagnose). |
missingness_rates |
Optional | Tibble variable, pct_na for those
columns. |
missing_pattern |
Optional | List: summary, overall,
n_complete from ild_missing_pattern() (global
columns). |
designPurpose: Design structure (WP/BP, imbalance, design-time missingness).
| Field | Required | Semantics |
|---|---|---|
ild_design_check |
Recommended | Full return value of ild_design_check() (embedded list
object). |
spacing_class, recommendation |
Optional | Redundant copies acceptable for fast reporting. |
wp_bp |
Optional | Decomposition tibble or NULL if vars
omitted. |
design_missingness |
Optional | Summaries bundled inside ild_design_check / design
slice. |
flags |
Recommended | list(has_wp_bp, spacing_class, irregular) for fast
checks. |
time_coverage |
Recommended | list(min, max, span_sec) from pooled timeline. |
occasion_imbalance |
Recommended | Same object as data$obs_per_id (mirrored for
visibility). |
fitPurpose: Estimation diagnostics (convergence, singularity, MCMC).
Frequentist lmerMod:
engine, singular, converged,
optimizer_messages, optinfo (lmer);
reml,
optimizer,
theta_summary
(length/min/max of
variance-component vector), X_rank
(fixed-effects design rank from qr(X)),
residual_correlation
(list(modeled, structure, note) — lmer has no AR1/CAR1;
modeled reflects attr(ild_ar1)).
Frequentist lme: above where
applicable; apVar_ok;
residual_correlation with
class (ild_correlation_class)
and coef_corStruct when present.
Bayesian (brmsfit):
engine, ild_posterior,
convergence_table, max_rhat,
n_divergent, n_max_treedepth_hits;
residual_correlation (note
only; residual structure is model-specific).
Optional heterogeneity (lmerMod,
lme, brmsfit):
list(available, reason, object). When
available is TRUE, object is an
ild_heterogeneity result from [ild_heterogeneity()]
(person-specific partial-pooling summaries). When extraction fails
(e.g. intercept-only fixed model without random effects, or parsing
error), available is FALSE and
reason is a short message.
ild_autoplot(bundle, section = "fit", type = "heterogeneity", ...)
forwards term / heterogeneity_type to
[ild_autoplot.ild_heterogeneity()].
residualPurpose: Residual behavior (ACF, Q-Q, vs time/fitted).
| Field | Required | Semantics |
|---|---|---|
legacy_ild_diagnostics |
Optional | Full ild_diagnostics() object for frequentist engines
(enables plot_ild_diagnostics() /
ild_autoplot(bundle, section = "residual", type = NULL)
multi-plot or short type names acf /
qq / fitted). |
residual_sd, cor_observed_fitted |
Optional | Scalar summaries. |
engine |
Recommended | Engine id. |
Bayesian fits may omit legacy_ild_diagnostics and keep
lighter summaries only.
predictivePurpose: Predictive checks (obs vs fitted, PPC).
| Content | Semantics |
|---|---|
| Frequentist | engine, n, mean_abs_error,
rmse, mean_residual (bias),
max_abs_error, cor_observed_fitted. |
| brms | ppc from ild_brms_ppc_summary() when
requested; same observation-level scalars as frequentist when
ild_augment succeeds (n, MAE, RMSE,
etc.). |
missingnessPurpose: Variable-level missingness
aligned to model terms (not duplicate of global
data$missing_pattern when both exist).
| Field | Semantics |
|---|---|
note |
Set when no model variables; else NULL. |
summary, overall |
From ild_missing_pattern(data, vars = predictors) when
predictors exist. |
pct_na_by_var |
Subset of summary (variable,
pct_na) when available. |
missing_model_diagnostic |
Optional: tidy, message,
outcome, predictors when
ild_diagnose(..., missing_model = TRUE) runs
ild_missing_model(). |
causalPurpose: Causal / weighting diagnostics (IPW, positivity).
Typical content: columns_found,
weight_summary (min, max,
mean) when .ipw or related columns exist on
ild_data. Optional
weight_detail (quantiles,
sum_w) when
ild_diagnose(..., causal_detail = TRUE).
warnings and guardrailswarnings (event log from software /
sampler):
| Column | Semantics |
|---|---|
source |
e.g. "lme4", "stan",
"tidyILD". |
level |
e.g. "warning", "note". |
message |
Human-readable text. |
code |
Short machine id: e.g. lmer_warning,
divergent_transitions. |
guardrails (methodological rules; see
?guardrail_registry):
| Column | Semantics |
|---|---|
rule_id |
Stable id (e.g. GR_SINGULAR_RANDOM_EFFECTS). |
section |
Bundle section the rule relates to (fit,
design, data, …). |
severity |
e.g. info, warning. |
triggered |
Always TRUE for rows that appear (only triggered rules
are stored). |
message, recommendation |
Run-specific or default text. |
Surfacing guardrails (package identity):
print() on
ild_diagnostics_bundle: After the slot listing,
prints a Summary block: warning row count, guardrail
row count, and (when nrow(guardrails) > 0)
highest severity (info <
warning) and up to five unique
rule_id values. Helpers:
guardrail_severity_rank() /
guardrail_max_severity() in
R/ild_guardrail_registry.R.ild_report(): On successful
ild_diagnose(), diagnostics_summary includes
guardrails_narrative (short string) and
guardrails (n,
max_severity, rule_ids). If any guardrails
fired, methods_with_guardrails repeats the
methods paragraph with the same guardrail sentence appended (equivalent
to ild_methods(fit, bundle = diagnose_result)).ild_methods(..., bundle = NULL):
Optional bundle from ild_diagnose(fit); when
nrow(bundle$guardrails) > 0, appends one sentence after
provenance: Methodological cautions (tidyILD guardrails):
…rule_id added to
ILD_GUARDRAIL_REGISTRY must have a deterministic
regression test that asserts the rule can still be triggered
(see tests/testthat/test-guardrails-triggers.R and
fit-level tests via evaluate_guardrails_fit() with a
constructed fit_diag when full ild_diagnose()
is impractical).Contract regression (semantics, not only shape):
tests/testthat/helper-contract-fixtures.R builds stable
seeded scenarios (regular vs irregular spacing, IPW instability, late
dropout with merged contextual guardrails, merged singular / brms
fit-level guardrails).
tests/testthat/test-contract-regression.R checks dense
bundle sections, expected rule_ids, ild_tidy /
ild_augment required columns, core
ild_autoplot() routes, and guardrail-aware
ild_methods(..., bundle) text. The brms scenario uses
skip_on_cran() because it fits a small brms
model.
summary_textCharacter vector of short narrative lines for reporting
(ild_report, printing). Not a substitute for structured
sections.
ild_tidy())Schema function: ild_tidy_schema() —
constants ILD_TIDY_REQUIRED_COLS,
ILD_TIDY_OPTIONAL_COLS.
term, component, effect_level,
estimate, std_error, conf_low,
conf_high, statistic, p_value,
interval_type, engine,
model_class
Frequentist (tidy_ild_model):
interval_type is "Wald";
statistic is the model-reported t (or
z-ratio when se = "robust").
brms: interval_type is
"quantile" (equal-tailed default intervals);
p_value is NA; optional posterior columns
filled when intervals = TRUE.
rhat, ess_bulk, ess_tail,
pd, rope_low, rope_high (Bayesian
/ posterior summaries).
componentConservative vocabulary (extend with engine-specific values only if documented):
| Value | Meaning |
|---|---|
fixed |
Standard fixed-effect regression coefficients (current default for all exposed FE rows). |
random |
Random-effect variance / covariance summaries when exposed as rows. |
auxiliary |
Dispersion, residual variance, or residual-correlation parameters
(not yet exposed for lme4/nlme in ild_tidy). |
scale |
Dispersion parameters (legacy alias in docs; prefer
auxiliary where appropriate). |
correlation |
Correlation structure parameters (e.g. AR1). |
nonlinear |
Smooth / spline terms (e.g. mgcv-style). |
effect_level| Value | Meaning |
|---|---|
population |
Intercept / cohort-mean type estimands ((Intercept)
rows). |
within |
Term name ends with _wp (within-person component). |
between |
Term name ends with _bp (between-person
component). |
cross_level |
Interaction term involving both _wp and
_bp substrings. |
unknown |
Coefficient rows that are not clearly classified (default for generic predictors). |
auxiliary |
Variance / sampler / state parameters (when such rows exist). |
interval_type| Value | Meaning |
|---|---|
Wald |
Symmetric interval from SE (normal or t as implemented). |
quantile |
Equal-tailed posterior quantiles. |
HPD |
Highest posterior density interval. |
normal |
Alias for large-sample normal approximation when distinct from
Wald. |
ild_augment())Schema function: ild_augment_schema() —
ILD_AUGMENT_REQUIRED_COLS,
ILD_AUGMENT_OPTIONAL_COLS.
.ild_id, .ild_time, .outcome,
.fitted, .resid, .resid_std,
engine, model_class
.resid_std semantics (principled but
sparse): set from residuals(fit, type = "pearson")
when the engine returns a numeric vector of the correct length;
otherwise all NA. Do not populate with
arbitrary z-scores of .resid in the same column (avoids
mixing definitions). For brms, Pearson residuals are
used when available; otherwise NA.
.fitted_lower, .fitted_upper
(e.g. equal-tailed 95% for brms
ild_augment), .influence, .state,
.state_lower, .state_upper (e.g. latent
states; often NA placeholders).
| Pattern | Reserved for |
|---|---|
.ild_* |
ILD system columns from ild_prepare() (do not overwrite
for non-ILD semantics). |
.fitted* |
Fitted / predicted mean response. |
.resid* |
Residuals (raw or transformed). |
.state* |
Latent trajectory / random-effect line values when exposed per row. |
The observed response is always
.outcome (no duplicate formula-named
column in the augmented tibble).
Skeleton:
inst/dev/backend-adapter-template.R — commented R patterns
for ild_tidy, ild_augment,
ild_diagnose, guardrails, provenance, tests, and docs (not
sourced by the package; copy into R/ when
implementing).
Each new estimation backend (e.g. KFAS, ctsem) should ship:
| Generic | Requirement |
|---|---|
ild_tidy() |
Return a tibble aligned with
ild_tidy_schema() (required columns when
feasible). |
ild_augment() |
Return a tibble aligned with
ild_augment_schema(); attach
attr(..., "ild_data") on the model object
used by diagnostics. |
ild_diagnose() |
Return ild_diagnostics_bundle only;
populate all relevant sections; use shared fillers/helpers where
possible (fill_diagnostics_*, guardrail evaluation). |
ild_autoplot() |
For bundle: section-first routing
(section + type); implement plotters that read
bundle slots; call engine-specific code only inside plotters. Attach
attr(bundle, "ild_fit") and
attr(bundle, "ild_data") from ild_diagnose()
so PPC, fitted, and missingness plots work without a second user
argument. |
attr(fit, "ild_data") (validated ILD) for
ild_augment / ild_diagnose.attr(fit, "ild_provenance") via
ild_new_analysis_provenance() (or equivalent) with
step set to the fitting function name
(e.g. ild_kfas), serializable args, and
outputs summarizing key choices.expect_named() /
expect_true(all(ild_tidy_schema()$required %in% names(...)))
for tidy output (or explicit waiver for transitional columns).expect_s3_class(..., "ild_diagnostics_bundle");
expect_no_error(validate_ild_diagnostics_bundle(...)) when
constructed via ild_diagnose().ild_tidy, ild_augment,
ild_diagnose run without error (use
skip_if_not_installed for heavy deps).@rdname as
existing engines where possible; describe engine-specific
fit / residual slots in
Details.Status: Phase 1 — provenance on input; best-effort
round-trip via ild_as_tsibble().
tbl_ts
inputs must have exactly one key variable. Multiple
keys (e.g. crossed factors) are not supported; reshape
or paste into a single id before ild_prepare().data is a
tbl_ts, ild_prepare() stores metadata in
attr(x, "tidyILD")$tsibble (key/index names, interval
summary string, is_regular, tsibble version). The
tbl_ts class is still dropped on output; this records
formal time-series semantics for reporting and for
ild_as_tsibble().ild_spacing$tsibble (when
present) links declared tsibble regularity/interval text to empirical
.ild_dt summaries; they may diverge after sorting,
duplicate handling, or row drops.ild_as_tsibble() passes
regular = from stored is_regular when
available so interval() often matches the original for
unchanged data.Normative spec:
inst/dev/BACKEND_VALIDATION_BENCHMARK_CONTRACT.md (scenario
IDs, tiers, metric columns, artifact layout).
Purpose: Run shared simulation scenarios across
lme4 / nlme / brms / KFAS / ctsem
entry points (where installed), write benchmark_raw.csv,
benchmark_summary.csv, and
benchmark_metadata.json, and optionally gate regressions
with JSON thresholds in
inst/benchmarks/thresholds-*.json.
Implementation (not exported API):
| Location | Role |
|---|---|
tests/testthat/helper-backend-validation-harness.R |
harness_run_benchmark(), metric extraction,
summarization, threshold evaluation. |
scripts/run-backend-validation-benchmarks.R |
CLI runner (--tier, --backends,
--n-sim, --seed, --out-dir).
Expects package root + pkgload::load_all() or
devtools::load_all(). |
scripts/check-backend-validation-thresholds.R |
Reads benchmark_summary.csv + thresholds JSON; writes
benchmark_checks.csv; exit code 1 on hard failures. |
.github/workflows/backend-validation-benchmarks.yml |
Scheduled / manual CI; uploads artifacts. |
When adding a new backend: extend the scenario
manifest and harness_fit_one() in the harness helper;
document the scenario here and in the benchmark contract; add or adjust
skip_if_not_installed() tests in
tests/testthat/test-backend-validation-harness.R. Prefer
warn-only thresholds for slow or fragile optional
engines until metrics stabilize.
User-facing functions (see
vignette("temporal-dynamics-model-choice", package = "tidyILD")):
| Function | Role |
|---|---|
ild_panel_lag_prepare() |
Multi-variable ild_lag() + single
ild_check_lags(); provenance step
ild_panel_lag_prepare. |
ild_compare_fits() |
Named list of fits → tibble (aic, bic,
n_obs, converged, optional
n_guardrails); not an automatic
nested-model test. |
ild_brms_dynamics_formula() |
Returns a suggested formula + notes for
ild_brms(); does not fit. |
Guardrails (registry +
evaluate_guardrails_contextual()):
GR_LAG_MEAN_STRONG_RESIDUAL_ACF_NO_AR — lag terms in
mean, no residual AR, pooled residual ACF at lag 1 above a fixed
threshold (heuristic).GR_INDEX_LAG_IRREGULAR_SPACING —
ild_spacing_class irregular-ish and data provenance records
ild_lag(..., mode = "index").help(ild_diagnostics_bundle),
help(ild_tidy_schema),
help(ild_augment_schema),
help(guardrail_registry).system.file("CONTRACTS.md", package = "tidyILD") (if
installed from source with inst/).