Robustness fix to the CPI reader (debt D-7.2, settled in the OU-hierarchical Session 10).
read_cpi() now locates the date and CPI columns with
which() instead of which.max().
which.max() on an all-FALSE
str_detect() (no column matched) silently returns index 1,
a false positive that pointed read_cpi at the first column;
it also masked ambiguity by taking only the first of several matches.
which() returns every matching index (or
integer(0)), so the existing length(col) != 1
guard now errors honestly on both no-match and multiple-match. Behaviour
on a well-formed CPI file is unchanged (verified against the bundled
extdata/CPI.xlsx: 1960-2023, columns
Year/CPI).Complete redesign to a genuinely evidence-based method. The aggregate index now enters the estimation as a real observation density; the sectoral indices come out as a posterior with credible intervals.
disaggregate_statespace() — canonical
engine. A Bayesian state-space model: a random-walk-with-drift
transition in log phi (partial pooling on the drift and the
innovation scale), an estimable cross-sectional concentration, and a
Student-t (or Gaussian) observation
cpi_t ~ (nu, sum_k W[t,k] phi[t,k], sigma). Sampled by
HMC (cmdstanr or rstan). Returns a [T, K, draws] array of
posterior draws of phi — exactly the multiple-imputation
input consumed by bayesianOU::fit_ou_nested_mi() — plus
credible bands and diagnostics.disaggregate_conjugate() — closed-form Bayesian
baseline. The exact linear-Gaussian posterior (Kalman filter +
RTS smoother), MCMC-free, with joint posterior draws via the
Durbin-Koopman simulation smoother. This is the correct realization of
the package’s original “MCMC-free posterior” aspiration: it does
condition on the aggregate evidence in closed form.Helpers: simulate_disagg() (the model’s own DGP, for
recovery and examples), align_disagg_inputs() and
disaggregate_from_files() (read + align CPI and VAB-weight
Excel files), disagg_default_priors(),
disagg_stan_code().
The 0.1.2 “deterministic Bayesian” family never conditioned on the
aggregate CPI (F1): the posterior was derived from the prior weight
matrix alone, several pieces cancelled on renormalization (Dirichlet
concentration F2, temporal pattern F3), the “efficiency” term was a
fixed constant (F4), there were no recovery tests (F5), and
robust_cor opportunistically picked the larger correlation
(F6). Because that foundational defect cannot be fixed without turning
the method into the new evidence-based engine, the deterministic blend
was retained for one design cycle as a baseline and then removed
entirely (it added nothing the two Bayesian engines do not do,
honestly). Deleted: bayesian_disaggregate(),
posterior_weighted/multiplicative/dirichlet/adaptive(),
compute_L_from_P(), spread_likelihood(),
coherence_score(), numerical_stability_exp(),
temporal_stability(), stability_composite(),
interpretability_score(), run_grid_search(),
save_results(), and the
robust_cor/kl_divergence/ total_variation/safe_div
utilities.
generate_quantities on frozen draws (isolating CSV
serialization rounding).evidence-based-disaggregation documenting
the model, the two engines, the F1–F6 rationale and the coupling to the
nested OU.DESCRIPTION no longer claims a “novel/original”
contribution (anti-overreach).