Regularized estimation in high-dimensional mechanistic models: application to vaccine development

Mélanie Prague 1,2, Lisa Crépin 1,2, Morgan Craig 3, Cécile Proust Lima 1

1 Univ. Bordeaux, Inserm, Inria (Bordeaux, France), 2 Vaccine Research Institute (Créteil, France), 3 Université de Montréal (Montréal, Canada)

Objectives : Mechanistic models are widely used to describe and explain biological processes over time [1]. However, they typically rely on a limited number of observable compartments and sparse longitudinal data. As a result, these models are often either too simple to capture complex biological phenomena, such as immune response dynamics following vaccination, or they face identifiability issues [2], particularly when considering interindividual variability in the form of nonlinear mixed-effects models based on systems of differential equations. In parallel, with ongoing technological advances, longitudinal high-throughput data (e.g., omics, including transcriptomics and proteomics data) are increasingly available in various contexts and could bring valuable information into mechanistic models to better capture underlying biological processes. However, when considering complex models with multiple unobserved compartments, integrating such high-dimensional data to inform the dynamics of unobserved biological compartments remains a major challenge, both mathematically and for broader interpretation purposes.

Methods : We hypothesize that observed omics biomarkers can be used to infer and explain the dynamics of unobserved immune compartments. This hypothesis is in line with recent work on deconvolution methods, which make it possible to infer cell abundance from transcriptomic data. Our goal is therefore to find biomarkers that accurately reflect the dynamics of these compartments. Here, we propose an estimation and regularization method [3] for mechanistic models that involve multiple unobserved compartments informed by high-dimensional longitudinal biomarker data. We aim to identify relevant biomarkers by regularizing the parameters linking them to the latent unobserved compartments while simultaneously estimating the population parameters of the structural mechanistic model. To do so, we are developing an iterative algorithm able to estimate and regularize all the parameters of the model. The algorithm alternates between a regularization step and a mechanistic inference step. The regularization step updates the coefficients linking latent compartments to biomarkers by computing derivatives of the penalized log-likelihood, approximated via a second-order Taylor expansion. The mechanistic inference step estimates the mechanistic parameters using the Stochastic Approximation Expectation-Maximization (SAEM) algorithm [4], implemented through the Monolix software, given the updated regularized coefficients from the regularization step. This approach allows us to identify which biomarkers' dynamics accurately describe the dynamics of the unobserved compartments of our model.
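The regularization step can be illustrated with a minimal sketch. It is not the REMixed implementation: it assumes a lasso penalty, a single latent trajectory held fixed, and a linear Gaussian link y_k(t) = alpha_k * X(t) + noise, all of which are simplifications chosen for illustration. The names `soft_threshold`, `regularization_step`, `sigma2` and `lam` are hypothetical. Under these assumptions, minimizing the second-order Taylor approximation of the penalized negative log-likelihood gives a soft-thresholded Newton update for each coefficient. The SAEM step performed with Monolix is not reproduced here.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def regularization_step(alpha, latent, biomarkers, sigma2, lam):
    """One lasso-penalized Newton update of the link coefficients.

    Assumed model (illustrative only): y_k(t) = alpha_k * X(t) + eps,
    with eps ~ N(0, sigma2).
    alpha      : (K,) current coefficients
    latent     : (T,) latent compartment trajectory X(t), held fixed
    biomarkers : (T, K) observed biomarker trajectories
    lam        : lasso penalty weight
    """
    resid = biomarkers - np.outer(latent, alpha)
    # First and second derivatives of the negative log-likelihood
    # with respect to each alpha_k.
    grad = -(latent @ resid) / sigma2
    hess = np.sum(latent ** 2) / sigma2
    # Minimizer of the quadratic approximation plus lam * |alpha_k|.
    return soft_threshold(alpha - grad / hess, lam / hess)


# Toy usage: only the first of three biomarkers tracks the latent compartment.
rng = np.random.default_rng(0)
t = np.arange(10.0)
latent = np.exp(-0.5 * t)
alpha_true = np.array([2.0, 0.0, 0.0])
biomarkers = np.outer(latent, alpha_true) + rng.normal(0.0, 0.1, (t.size, 3))

alpha_hat = regularization_step(np.zeros(3), latent, biomarkers,
                                sigma2=0.01, lam=50.0)
print(alpha_hat)
```

In a full alternating scheme, the latent trajectories would be recomputed from the mechanistic parameters after each inference step, and this update would be repeated until the coefficients stabilize.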

Results : To demonstrate the performance of our methodology, we investigated the robustness of the proposed approach in simulations and applied it to immune responses to the Pfizer/BioNTech COVID-19 mRNA vaccine (BNT162b2) [5]. We examined blood samples collected daily from 15 infection-naïve individuals for 9 days following each of the two vaccine injections. The dataset contains 8,172 gene expression measurements, grouped into 34 pathways defined by BioMart [6]. We also used the serological samples obtained at baseline, day 7 and day 14 to inform the observed antibody compartment. We then applied our method to find which genes accurately reflect the dynamics of unobserved compartments in our post-vaccination immune response model. We estimate the model and show that the dynamics of inflammation, neutrophils, interferon, and type I interferon, measured from transcriptomic data, are most strongly associated with the establishment of the short-term immune response after vaccination.

Conclusion : We demonstrate that the use of transcriptomic data improves model identifiability by informing the dynamics of latent compartments. We also discuss several limitations, including the assumed linear relationship between transcriptomic measurements and cell compartments, as well as the high computational cost, and we outline possible directions for relaxation of these assumptions. The method is implemented in the REMixed R package, which is available on CRAN.

References:
[1] Perelson, A. S. and Ribeiro, R. M. (2018). Introduction to modeling viral infections and immunity. Immunological Reviews, 285(1):5.
[2] Wilding, K. M., Molina-París, C., Kubicek-Sutherland, J. Z., McMahon, B., Perelson, A. S., and Ribeiro, R. M. (2025). A consensus mathematical model of vaccine-induced antibody dynamics for multiple vaccine platforms and pathogens. Frontiers in Immunology, 16:1596518.
[3] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society (Series B), 58:267–288.
[4] Kuhn, E. and Lavielle, M. (2005). Maximum likelihood estimation in nonlinear mixed effects models. Computational Statistics & Data Analysis, 49(4):1020–1038.
[5] Rinchai, D., Deola, S., Zoppoli, G., Kabeer, B. S. A., Taleb, S., …, Hssain, A. A., Bedognetti, D., Grivel, J.-C., and Chaussabel, D. (2022). High–temporal resolution profiling reveals distinct immune trajectories following the first and second doses of COVID-19 mRNA vaccines. Science Advances, 8(45):eabp9961.
[6] Durinck, S., Moreau, Y., Kasprzyk, A., Davis, S., De Moor, B., Brazma, A., & Huber, W. (2005). BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics, 21(16), 3439-3440.

Reference: PAGE 34 (2026) Abstr 12110 [www.page-meeting.org/?abstract=12110]

Poster: Oral: Methodology - New Tools