I-094

Post-hoc model joining with Normalizing Flows for efficient and scalable multi-endpoint PKPD

Niklas Korsbo 1, Lorenzo Contento, Mohamed Tarek

1 PumasAI (Dover, USA)

Objectives

Utilizing longitudinal biomarkers to improve primary endpoint predictions is a common objective in clinical pharmacology, yet traditional joint NLME approaches become computationally intractable and workflow-prohibitive beyond 3-4 endpoints. FOCE-based joint estimation becomes impractical for the dozens of endpoints needed in real-world applications (e.g., 10+ biomarker models), and even when smaller joint models can be fitted, they create workflow paralysis: tangled iteration cycles, slow fits, and poor plannability across teams. We present a post-hoc model joining approach using Normalizing Flows (NFs) that addresses this core barrier by enabling modular workflows where endpoints are fitted independently and subsequently joined via learned joint priors over random effects [1,2,3]. We demonstrate near-optimal cross-endpoint information transfer in a realistic multi-endpoint setting, suggesting feasibility for the dozens of endpoints encountered in practice.

Methods

We simulated a four-endpoint system: an oral one-compartment PK model driving three PD endpoints (a primary indirect-response model plus AST and ALT turnover models), with six correlated PD random effects (two per endpoint). The simulated data comprised 200 training, 50 tuning, and 500 test subjects.

Models were fitted sequentially: PK first, then each PD endpoint independently, with PK population parameters fixed but PK random-effect variance propagated. The three resulting sub-models were combined into a joint model with initially independent PD random-effect priors. We then trained a normalizing flow on Laplace-approximated posteriors from the training data to learn a flexible joint prior distribution over the six PD random effects, capturing correlations across endpoints. This learned joint prior replaced the original independent priors, yielding a fully joint NLME model despite never jointly fitting the structural components. The architecture was selected by grid search over 1-3 coupling layers and 1-3 hidden layers, with five random seeds per configuration; the configuration with the best median tuning-set improvement was chosen.
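To make the term "coupling layer" concrete, the following is a minimal NumPy sketch of a Real NVP-style affine coupling flow [1] over a six-dimensional random-effect vector. It is an illustration under stated assumptions, not the implementation used in this work: the class name AffineCoupling, the helper log_prob, the mask layout, the hidden width, and the standard-normal base distribution are all hypothetical choices, the weights are random rather than trained, and the fitting step on Laplace-approximated posteriors is omitted.

```python
import numpy as np

class AffineCoupling:
    """One Real NVP-style affine coupling layer (hypothetical sketch)."""

    def __init__(self, dim, mask, hidden=16, rng=None):
        rng = np.random.default_rng(0) if rng is None else rng
        self.mask = np.asarray(mask, dtype=float)  # 1 = dimension passes through unchanged
        # Small one-hidden-layer conditioner that outputs a scale and a shift
        self.W1 = rng.normal(scale=0.1, size=(dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(scale=0.1, size=(hidden, 2 * dim))
        self.b2 = np.zeros(2 * dim)

    def _scale_shift(self, x_masked):
        h = np.tanh(x_masked @ self.W1 + self.b1)
        s, t = np.split(h @ self.W2 + self.b2, 2, axis=-1)
        free = 1.0 - self.mask  # only the non-masked dimensions are transformed
        return np.tanh(s) * free, t * free

    def forward(self, z):
        """Map z -> eta; return eta and log|det J| of the map."""
        s, t = self._scale_shift(z * self.mask)
        return z * np.exp(s) + t, s.sum(axis=-1)

    def inverse(self, eta):
        """Map eta -> z; return z and log|det J| of the inverse map."""
        s, t = self._scale_shift(eta * self.mask)
        return (eta - t) * np.exp(-s), -s.sum(axis=-1)

def log_prob(layers, eta):
    """Log-density of eta under the flow with a standard-normal base."""
    z, total = eta, np.zeros(eta.shape[0])
    for layer in reversed(layers):
        z, log_det = layer.inverse(z)
        total = total + log_det
    base = -0.5 * (z ** 2).sum(axis=-1) - 0.5 * z.shape[-1] * np.log(2 * np.pi)
    return base + total

# Two layers with complementary masks, so every dimension gets transformed once
rng = np.random.default_rng(1)
masks = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
layers = [AffineCoupling(6, m, rng=rng) for m in masks]

z = rng.normal(size=(5, 6))  # base samples
eta = z
for layer in layers:
    eta, _ = layer.forward(eta)  # samples from the (untrained) joint prior
lp = log_prob(layers, eta)       # exact log-density of those samples
```

Because each layer is invertible with a cheap Jacobian determinant, the flow gives both samples and exact densities, which is what allows it to stand in for a parametric random-effect prior.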

Performance was evaluated via ΔLL: the improvement in total test-set log-likelihood (all observations including PK) relative to the independent-prior baseline, measured in nats (n=500 test subjects). The oracle ceiling was the data-generating model with true correlation structure and true population parameters.
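Written out, and assuming that each model M assigns every test subject i a log-likelihood over all of that subject's observations y_i, the metric can be expressed as below. The symbols M, indep, y_i and N are notation introduced here for illustration only.

```latex
\Delta\mathrm{LL}(M) \;=\; \sum_{i=1}^{N} \Bigl[ \log p_{M}(y_i) \;-\; \log p_{\mathrm{indep}}(y_i) \Bigr],
\qquad N = 500 .
```

Positive values favor model M over the independent-prior baseline, and the baseline itself has ΔLL = 0 by construction.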

Results

The NF-joined model achieved ΔLL = 337.6 ± 0.05 nats, representing 77.3% of the oracle ceiling (DGM with true Ω and true θ: ΔLL = 436.7 nats). Notably, this recovery was achieved despite the biomarkers having weak linear predictive power for the primary endpoint (R² = 8.1% for AST+ALT jointly predicting primary random effects in the data-generating model), demonstrating the method’s effectiveness at extracting even weak dependency structures using a sequential modeling approach.
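The recovered fraction follows directly from the two ΔLL values reported above; the symbol f is introduced here only for this check.

```latex
f \;=\; \frac{\Delta\mathrm{LL}_{\mathrm{NF}}}{\Delta\mathrm{LL}_{\mathrm{oracle}}}
  \;=\; \frac{337.6}{436.7} \;\approx\; 0.773 .
```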

To quantify cross-endpoint information transfer, we computed empirical Bayes estimates of primary-endpoint random effects from biomarker data alone (no primary observations). The NF model’s biomarker-derived primary EBEs closely matched those of the oracle DGM (mean R² = 0.82), demonstrating that the learned NF prior recovers nearly the same information transfer as the true correlation structure. Under the independent model, no such transfer occurred. Primary endpoint time-course predictions from biomarker data alone confirmed this: the NF and DGM models’ individualized predictions closely tracked each other.
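As a rough illustration of how such an agreement score could be computed, the sketch below takes R² to be the squared Pearson correlation between NF-derived and oracle-derived EBEs, averaged over the two primary-endpoint random effects. Both that reading of "mean R²" and every name in the code (r_squared, ebe_oracle, ebe_nf) are assumptions, and the arrays are synthetic placeholders rather than results from this study.

```python
import numpy as np

def r_squared(a, b):
    """Squared Pearson correlation between two 1-D arrays (assumed R² definition)."""
    r = np.corrcoef(a, b)[0, 1]
    return float(r ** 2)

# Synthetic stand-ins for EBEs: 500 test subjects x 2 primary random effects
rng = np.random.default_rng(0)
ebe_oracle = rng.normal(size=(500, 2))
ebe_nf = ebe_oracle + 0.5 * rng.normal(size=(500, 2))  # noisy copy, for illustration only

# Average the per-random-effect R² values
mean_r2 = float(np.mean([r_squared(ebe_nf[:, k], ebe_oracle[:, k]) for k in range(2)]))
```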

Conclusions

Post-hoc model joining via normalizing flows recovered 77% of the oracle ceiling in this proof-of-principle, demonstrating that modular fitting can achieve substantial cross-endpoint information transfer even when correlations are weak. Combined with the theoretical generality of post-hoc prior replacement, these results suggest that the approach can scale to the dozens of endpoints encountered in practice, where monolithic joint models are infeasible. Critically, modular workflows replace one unpredictable large task with many smaller, independently tractable ones, making timelines estimable and enabling use within the fixed decision schedules of clinical development. Teams can iterate on endpoints in parallel, reuse models from prior studies (with the NF adapting for population differences), and require no manual correlation specification; the joint distribution is learned from data. Flexible prior refitting via neural density estimation provides a general tool for multi-endpoint pharmacometrics.

References:
[1] Dinh L, Sohl-Dickstein J, Bengio S. Density estimation using Real NVP. ICLR 2017.
[2] Papamakarios G, Nalisnick E, Rezende DJ, Mohamed S, Lakshminarayanan B. Normalizing Flows for Probabilistic Modeling and Inference. JMLR 2021;22(57):1-64.
[3] Contento L, Tarek M. Improving simulations by learning the true random effects’ distribution from a population using generative machine learning. PAGE 32 (2024) Abstr 11038 [www.page-meeting.org/?abstract=11038].

Reference: PAGE 34 (2026) Abstr 11880 [www.page-meeting.org/?abstract=11880]

Poster: Methodology - New Modelling Approaches