The Empirical Bayes Variational Autoencoder – A Neural ODE Approach for Population Modeling in Pharmacology - PAGE Meeting (Population Approach Group Europe)

Marcus Baaz ¹, Anders Sjöberg ¹,², Mats Jirstrand ¹

1 Fraunhofer Chalmers Centre (Gothenburg, Sweden), 2 Department of Electrical Engineering, Chalmers University of Technology (Gothenburg, Sweden)

Objectives
Population modeling using nonlinear mixed-effects (NLME) frameworks is fundamental to pharmacometrics, enabling rigorous quantification of inter- and intra-individual variability in pharmacokinetics (PK) and pharmacodynamics (PD). Estimation algorithms such as stochastic approximation expectation–maximization (SAEM), implemented in widely used platforms including Monolix and NONMEM, provide robust inference but typically rely on predefined structural models and parametric assumptions.
Recent advances in neural ordinary differential equations (neural ODEs) extend deep learning to continuous-time dynamical systems [1], offering flexible, data-driven representations of temporal processes. Variational autoencoders (VAEs) [2] provide a scalable alternative to sampling-based inference such as SAEM by replacing iterative posterior estimation with a parameterized posterior learned jointly with the model.
In this context, the objectives of the present study are:
1. To formalize the conceptual relationship between SAEM and VAE-based population modeling.
2. To assess, through simulation studies, the impact of an empirical Bayes prior on the recovery of correlated latent effects and covariate relationships.
3. To evaluate predictive performance on clinical PK data relative to a previously published neural ODE benchmark [3].
Methods
We formulate a population neural ODE within an encoder–decoder VAE architecture. Inter-individual variability (IIV) is represented by a Gaussian vector partitioned into initial-condition and dynamic components. An encoder network approximates the individual posterior over latent parameters. The decoder comprises a neural ODE governing latent dynamics, a linear observation mapping to the concentration domain, and a Gaussian additive–proportional residual error model.
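The decoder described above can be sketched in a few lines of NumPy. This is a minimal illustration only, not the authors' implementation: the one-hidden-layer vector field, the fixed-step RK4 integrator, the variance form a² + (b·f)² for the additive-proportional error, and all function and parameter names are assumptions made for the example. Dosing input is omitted for brevity.

```python
import numpy as np

def mlp_vector_field(z, eta_dyn, params):
    """Hypothetical latent dynamics dz/dt = f(z, eta_dyn): one tanh hidden layer."""
    W1, b1, W2, b2 = params
    hidden = np.tanh(W1 @ np.concatenate([z, eta_dyn]) + b1)
    return W2 @ hidden + b2

def rk4_solve(z0, eta_dyn, params, times, substeps=10):
    """Integrate the latent ODE with fixed-step RK4; returns the state at each time."""
    z, t_prev = z0.copy(), times[0]
    states = [z0.copy()]
    for t in times[1:]:
        h = (t - t_prev) / substeps
        for _ in range(substeps):
            k1 = mlp_vector_field(z, eta_dyn, params)
            k2 = mlp_vector_field(z + 0.5 * h * k1, eta_dyn, params)
            k3 = mlp_vector_field(z + 0.5 * h * k2, eta_dyn, params)
            k4 = mlp_vector_field(z + h * k3, eta_dyn, params)
            z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        states.append(z.copy())
        t_prev = t
    return np.stack(states)

def decode(eta, params, obs_w, times, n_state=2):
    """Split the random-effect vector into initial-condition and dynamic parts,
    solve the latent ODE, and map linearly to the concentration domain."""
    eta_init, eta_dyn = eta[:n_state], eta[n_state:]
    latent = rk4_solve(eta_init, eta_dyn, params, times)
    return latent @ obs_w

def combined_error_loglik(y, pred, a, b):
    """Gaussian log-likelihood with variance a^2 + (b * pred)^2 (assumed form)."""
    var = a**2 + (b * pred) ** 2
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (y - pred) ** 2 / var)
```

In a real implementation the network weights, the observation weights `obs_w`, and the error parameters `a` and `b` would be trained jointly by gradient descent in an automatic-differentiation framework; plain NumPy is used here only to keep the structure visible.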
Departing from standard VAEs with fixed Gaussian priors, we introduce an empirical Bayes population prior parameterized as a covariate-dependent multivariate Gaussian distribution with learnable mean and covariance. This prior captures correlated random effects and systematic covariate influences.
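As a rough illustration of such a prior, the NumPy sketch below builds a covariate-dependent Gaussian and evaluates the closed-form KL divergence from an encoder posterior to it. The abstract does not specify the parameterization, so the linear covariate effect `B`, the Cholesky-style factor `L_raw`, the diagonal-Gaussian encoder posterior, and the function names `prior_params` and `kl_diag_to_full` are all assumptions made for the example.

```python
import numpy as np

def prior_params(covariates, B, mu0, L_raw):
    """Assumed prior: mean = mu0 + B @ covariates, covariance = L @ L.T.
    Exponentiating the diagonal of L_raw keeps the covariance positive definite."""
    L = np.tril(L_raw, -1) + np.diag(np.exp(np.diag(L_raw)))
    return mu0 + B @ covariates, L @ L.T

def kl_diag_to_full(m, s, mu, Sigma):
    """KL( N(m, diag(s^2)) || N(mu, Sigma) ) in closed form."""
    d = m.size
    Sigma_inv = np.linalg.inv(Sigma)
    diff = mu - m
    trace_term = np.sum(np.diag(Sigma_inv) * s**2)  # tr(Sigma^-1 diag(s^2))
    maha = diff @ Sigma_inv @ diff
    _, logdet_prior = np.linalg.slogdet(Sigma)
    logdet_post = 2.0 * np.sum(np.log(s))
    return 0.5 * (trace_term + maha - d + logdet_prior - logdet_post)
```

Setting `B` and `L_raw` to zero and `mu0` to zero gives the standard normal prior of a conventional VAE, so in this sketch the fixed-prior formulation is a special case of the learnable one.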
Two experimental settings were considered. A simulation study with a large synthetic cohort evaluated recovery of known latent correlations and covariate effects under both VAE formulations. A clinical PK case study, previously analyzed using both traditional compartment models and low-dimensional neural ODEs, assessed predictive performance under truncated observation windows. In all experiments, low-dimensional latent representations were used to enforce a strong information bottleneck and limit overfitting.
Results
We show that quantities central to NLME modeling, such as population parameters and empirical Bayes estimates (EBEs), have natural counterparts within this framework. Moreover, the likelihood is formulated analogously, relying on marginalization over latent random effects.
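The correspondence can be written out as follows. The notation is assumed for illustration and is not taken from the abstract: y_i denotes the observations of individual i, x_i the covariates, η_i the latent random effects, θ the decoder parameters, ψ the prior parameters, and φ the encoder parameters.

```latex
% NLME-style marginal likelihood: random effects are integrated out
p_{\theta,\psi}(y_i \mid x_i)
  = \int p_\theta(y_i \mid \eta_i)\, p_\psi(\eta_i \mid x_i)\, \mathrm{d}\eta_i

% VAE objective: evidence lower bound on the same marginal log-likelihood
\log p_{\theta,\psi}(y_i \mid x_i)
  \;\ge\;
  \mathbb{E}_{q_\phi(\eta_i \mid y_i)}\!\left[ \log p_\theta(y_i \mid \eta_i) \right]
  - \mathrm{KL}\!\left( q_\phi(\eta_i \mid y_i) \,\|\, p_\psi(\eta_i \mid x_i) \right)
```

Under this reading, ψ plays the role of the population parameters, and a point summary of the encoder posterior q_φ(η_i | y_i), such as its mean, is one plausible counterpart to an EBE.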
In simulation studies, the empirical Bayes VAE accurately recovered the correlated latent IIV structures and covariate-dependent population effects, whereas the standard VAE with a fixed Gaussian prior failed to reproduce the true variability when individual parameters were correlated. The empirical Bayes VAE also generalized well to dosing regimens not seen during training, provided they were close to the observed regimens. In contrast, the fixed-prior formulation showed biased population predictions and distorted latent representations.
In the clinical PK case study, diagnostic assessments indicated satisfactory goodness-of-fit and well-calibrated predictive variability. Across 100 random 70/30 cross-validation splits, predictive performance evaluated on held-out individuals (unseen in training) showed a median absolute prediction error of 0.42 for the empirical Bayes VAE compared with 0.57 for the previously published neural ODE benchmark [3].
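The evaluation protocol described above might be organized along the following lines. This is a sketch under assumptions: the abstract does not define the error metric, so the median of |prediction − observation| is used here, and the function names and the seed are hypothetical. Splitting is done on individuals rather than on single observations, so that held-out subjects are entirely unseen during training.

```python
import numpy as np

def subject_splits(subject_ids, n_splits=100, train_frac=0.7, seed=0):
    """Yield (train_ids, test_ids) pairs from random splits of unique individuals."""
    rng = np.random.default_rng(seed)
    ids = np.unique(subject_ids)
    n_train = int(round(train_frac * ids.size))
    for _ in range(n_splits):
        perm = rng.permutation(ids)
        yield perm[:n_train], perm[n_train:]

def median_abs_error(obs, pred):
    """Median absolute prediction error (assumed definition of the metric)."""
    return float(np.median(np.abs(np.asarray(pred) - np.asarray(obs))))
```

For each split, a model would be fitted on the training individuals, predictions generated for the held-out individuals, and `median_abs_error` computed on those predictions before summarizing across splits.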

Conclusions
Incorporating an empirical Bayes prior within the VAE–neural ODE framework substantially improves recovery of correlated IIV and covariate effects compared with a fixed Gaussian prior. By learning the population mean and covariance from the data, the model preserves the hierarchical structure of classical NLME approaches while leveraging scalable gradient-based optimization. The evidence lower bound objective is left unaltered, which maintains probabilistic coherence and interpretable uncertainty. Simulation studies highlight that a flexible population prior is critical for identifiable and unbiased inference of correlated latent parameters and covariate effects. In the clinical application, the empirical Bayes VAE successfully modeled a small real-world PK dataset, achieving predictive performance comparable to both a previously published neural ODE implementation and traditional NLME compartment models.

Overall, the empirical Bayes VAE provides a scalable extension of classical population modeling that remains interpretable in terms of established NLME quantities, combining hierarchical Bayesian structure with flexible neural dynamics. The approach is particularly promising in settings where traditional parametric assumptions may be restrictive, for example in the presence of multimodal data, while preserving principled population-level inference and uncertainty quantification. In addition, it has the potential to facilitate automated data-driven workflows when the data are sufficiently informative, without requiring manually specified structural models.

References:
[1] Chen RTQ, Rubanova Y, Bettencourt J, Duvenaud D. Neural Ordinary Differential Equations. arXiv 2019. https://doi.org/10.48550/arXiv.1806.07366.
[2] Kingma DP, Welling M. An Introduction to Variational Autoencoders. Found Trends® Mach Learn 2019;12:307–92. https://doi.org/10.1561/2200000056.
[3] Bräm DS, Steiert B, Pfister M, Steffens B, Koch G. Low-dimensional neural ordinary differential equations accounting for inter-individual variability implemented in Monolix and NONMEM. CPT Pharmacomet Syst Pharmacol 2025;14:5–16. https://doi.org/10.1002/psp4.13265.

Reference: PAGE 34 (2026) Abstr 12305 [www.page-meeting.org/?abstract=12305]

Poster: Methodology - New Modelling Approaches