III-094

Model Ensembling and Machine Learning Approaches to Predict the First Dose of Amoxicillin in Intensive Care

Mihaly Leiwolf 1,2, Nicolas Gregoire 1,3, Sophie Magréault 4,5, Bénédicte Franck 6,7, Ombeline Krekounian 1,8, Jean-Baptiste Woillard 2,9, Vincent Aranzana-Climent 1

1 Inserm U1070 PHAR2, Université De Poitiers (Poitiers, France), 2 Inserm U1248 P&T, Université de Limoges (Limoges, France), 3 Department of Toxicology and Pharmacokinetics, University Hospital of Poitiers (Poitiers, France), 4 Department of Pharmacology, AP-HP, Groupe Hospitalier Paris Seine Saint-Denis (Bondy, France), 5 Inserm UMR1137, IAME, Université Paris Cité & Sorbonne Université (Paris, France), 6 Université Rennes, CHU Rennes, EHESP, Irset – UMR S 1085 (Rennes, France), 7 Inserm, Centre d’Investigation Clinique 1414 (Rennes, France), 8 Department of Clinical Pharmacology, Medical University of Vienna (Vienna, Austria), 9 Department of Pharmacology, Toxicology, and Pharmacovigilance, University Hospital of Limoges (Limoges, France)

Introduction
A priori model-informed precision dosing (MIPD) recommends an appropriate first dose based solely on the patient’s covariates. Despite potential clinical benefits such as faster target attainment and no need for concentration measurements, a priori MIPD remains less explored than a posteriori MIPD.
Objective
To develop and evaluate population pharmacokinetic (PopPK) model ensembling and machine learning (ML) approaches to predict a first dose of amoxicillin to reach trough concentrations of 40-80 mg/L [1] in intensive care unit (ICU) patients.
Methods
Following a bibliographic review, a large virtual patient population was simulated based on cohorts from 4 published amoxicillin PopPK models developed in adults [2-5].
Methods were developed on simulated data of 2500 subjects by reproducing the 4 model development cohorts. Covariate correlations were derived from MIMIC-IV, a large ICU dataset [6]. Serum creatinine, sex, age, and body weight were sampled from a multivariate distribution incorporating the derived correlation structure. Binary indicators were added for burn status, ICU, and obesity.
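The correlated sampling step can be illustrated with a minimal sketch. It assumes only two covariates (body weight and serum creatinine) following a bivariate normal distribution on the log scale; the means, standard deviations, and correlation below are hypothetical placeholders, not the values derived from MIMIC-IV, and the original work was carried out in R.

```python
import math
import random

# Hypothetical log-scale parameters (NOT derived from MIMIC-IV).
LOG_MEAN = (math.log(75.0), math.log(80.0))  # body weight (kg), serum creatinine (umol/L)
LOG_SD = (0.20, 0.40)
RHO = 0.30          # assumed correlation between the two log-covariates
N_SUBJECTS = 2500   # number of simulated subjects stated in the abstract


def sample_pair(rng):
    """Draw one correlated pair using the 2x2 Cholesky factor of the correlation matrix."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = LOG_MEAN[0] + LOG_SD[0] * z1
    x2 = LOG_MEAN[1] + LOG_SD[1] * (RHO * z1 + math.sqrt(1.0 - RHO ** 2) * z2)
    return x1, x2


def pearson(xs, ys):
    """Empirical Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2026)
log_pairs = [sample_pair(rng) for _ in range(N_SUBJECTS)]
log_wt = [p[0] for p in log_pairs]
log_scr = [p[1] for p in log_pairs]

# Back-transform to the natural scale; both covariates are positive by construction.
weight_kg = [math.exp(v) for v in log_wt]
creatinine = [math.exp(v) for v in log_scr]

print("empirical correlation (log scale):", round(pearson(log_wt, log_scr), 3))
```

Extending this to four covariates would require the Cholesky factor of a 4x4 correlation matrix; the two-covariate case is shown only because it can be written out by hand.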
Steady-state trough concentrations were simulated with mrgsolve (R) using cohort-specific dosing regimens. Ground truth concentrations were generated with inter-individual variability using the subject’s original model. Population predictions were obtained for each subject using all 4 models and treated as predicted values. Model performance was evaluated by comparing these true and predicted concentrations.
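The distinction between ground truth and population predictions can be sketched as follows. This is an illustration under strong assumptions, not the mrgsolve workflow used in the study: it assumes a one-compartment model with repeated zero-order infusions, inter-individual variability on clearance only, and hypothetical parameter values and dosing regimen. The published models may have different structures.

```python
import math
import random


def trough_ss(dose_mg, tau_h, tinf_h, cl_l_h, v_l):
    """Steady-state trough (mg/L) for a one-compartment model with repeated infusions."""
    k = cl_l_h / v_l                      # elimination rate constant (1/h)
    rate = dose_mg / tinf_h               # infusion rate (mg/h)
    # Concentration at the end of an infusion at steady state.
    c_end_inf = (rate / cl_l_h) * (1.0 - math.exp(-k * tinf_h)) / (1.0 - math.exp(-k * tau_h))
    # Exponential decay from end of infusion to the next dose.
    return c_end_inf * math.exp(-k * (tau_h - tinf_h))


# Hypothetical population parameters and regimen (illustration only).
CL_POP, V_POP = 10.0, 25.0                # L/h, L
OMEGA_CL = 0.3                            # SD of eta on log-clearance
DOSE_MG, TAU_H, TINF_H = 2000.0, 6.0, 0.5

rng = random.Random(1)
eta_cl = rng.gauss(0.0, OMEGA_CL)

# "Ground truth": individual clearance includes the random effect.
true_trough = trough_ss(DOSE_MG, TAU_H, TINF_H, CL_POP * math.exp(eta_cl), V_POP)
# Population prediction: random effect set to zero.
pop_trough = trough_ss(DOSE_MG, TAU_H, TINF_H, CL_POP, V_POP)

print("prediction/truth ratio:", round(pop_trough / true_trough, 3))
```

When the infusion duration equals the dosing interval, the function reduces to the continuous-infusion steady-state concentration, infusion rate divided by clearance.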
As a reference, two previously published methods, weighted model ensembling (WME) and classification tree-informed (CT-inf) ensembling [7], were applied to this context. WME attributes weights based on model performance in different covariate subgroups; models performing better in a subgroup where overall performance was low were assigned a higher weight. In CT-inf ensembling, a decision tree was fitted to each model, with the covariates as predictors and prediction correctness (falling within the bioequivalence range of 0.8-1.25 times the true value) as the binary target variable. The proportion of correct predictions within a leaf was used to derive model weights.
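The correctness criterion and the leaf-level weighting can be sketched as below. The tree-fitting step is deliberately left out: the sketch assumes that the subjects falling into one leaf are already known. Normalizing the proportions so that the weights sum to one is an assumption, since the abstract does not state how weights are derived from the proportions. Model names and concentrations are made up for illustration.

```python
def is_correct(pred, truth, lower=0.8, upper=1.25):
    """True if the prediction lies within 0.8-1.25 times the true value."""
    return lower <= pred / truth <= upper


def leaf_weights(preds_by_model, truths):
    """Normalized proportion of correct predictions per model within one leaf."""
    proportions = {}
    for model, preds in preds_by_model.items():
        n_correct = sum(is_correct(p, t) for p, t in zip(preds, truths))
        proportions[model] = n_correct / len(truths)
    total = sum(proportions.values())
    if total == 0:
        # No model is ever correct in this leaf: fall back to equal weights.
        return {model: 1.0 / len(proportions) for model in proportions}
    return {model: prop / total for model, prop in proportions.items()}


# Hypothetical true and predicted trough concentrations (mg/L) for four subjects in one leaf.
truths = [50.0, 60.0, 40.0, 70.0]
preds_by_model = {
    "model_A": [48.0, 66.0, 45.0, 30.0],   # 3 of 4 within range
    "model_B": [20.0, 61.0, 90.0, 72.0],   # 2 of 4 within range
}

weights = leaf_weights(preds_by_model, truths)
print(weights)
```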
Two novel ensembling strategies were developed. First, factor analysis of mixed data (FAMD) [8] was used to assign model weights based on patient similarity to the original model cohorts: after dimensionality reduction, the test subjects were projected onto the latent space, and model weights were determined by the normalized reciprocal of the Mahalanobis distances [9] between each test subject and the model cohort centroids. Second, regression tree-informed (RT-inf) ensembling predicts the log individual prediction/observation ratio and attributes weights based on its reciprocal.
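The distance-based weighting can be sketched as below. The FAMD projection itself is not shown: the sketch assumes that a patient and the cohort centroids already have coordinates in a two-dimensional latent space and that a single shared covariance matrix is used. The coordinates, covariance matrix, and cohort labels are hypothetical.

```python
import math


def mahalanobis_2d(point, centroid, cov):
    """Mahalanobis distance in 2D, using the closed-form inverse of a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx = point[0] - centroid[0]
    dy = point[1] - centroid[1]
    # Quadratic form v^T * inverse(cov) * v, with inverse(cov) = [[d, -b], [-c, a]] / det.
    q = (dx * (d * dx - b * dy) + dy * (a * dy - c * dx)) / det
    return math.sqrt(q)


def inverse_distance_weights(point, centroids, cov, eps=1e-12):
    """Normalized reciprocal distances; eps guards against a zero distance."""
    inverse = {
        name: 1.0 / max(mahalanobis_2d(point, mu, cov), eps)
        for name, mu in centroids.items()
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


# Hypothetical latent-space centroids for the four development cohorts.
CENTROIDS = {
    "critically_ill": (0.0, 0.0),
    "burn": (3.0, 4.0),
    "obese": (-2.0, 5.0),
    "endocarditis": (4.0, -1.0),
}
COV = ((2.0, 0.5), (0.5, 1.0))   # assumed shared covariance of the latent coordinates

patient = (0.5, 0.5)
weights = inverse_distance_weights(patient, CENTROIDS, COV)
print(max(weights, key=weights.get))
```

With an identity covariance matrix the distance reduces to the Euclidean distance, which is a convenient sanity check.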
Four ML algorithms (support vector machine, k-nearest neighbors, random forest, and XGBoost) were applied to predict the dose resulting in the target concentration based on covariates and dosing scheme.
The ensembling and ML approaches were compared to the single-model approach (using the same PopPK model for all subjects), standard dose (200 mg/kg) and uninformed ensembling (same weight to all models).
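The difference between uninformed and informed ensembling, and the conversion of an ensembled prediction into a dose, can be sketched as below. Several points are assumptions rather than statements from the abstract: concentrations (not doses) are averaged, the target is taken as the midpoint of the 40-80 mg/L window, and dose is scaled proportionally, which is only valid for linear pharmacokinetics. The reference dose, predictions, and weights are hypothetical.

```python
TARGET_MG_L = 60.0      # assumed midpoint of the 40-80 mg/L target window
REF_DOSE_MG = 12000.0   # hypothetical reference daily dose used for the predictions


def ensemble_prediction(preds, weights):
    """Weighted average of the model predictions (weights are renormalized)."""
    total = sum(weights[m] for m in preds)
    return sum(weights[m] * preds[m] for m in preds) / total


def dose_for_target(ref_dose_mg, pred_at_ref_mg_l, target_mg_l):
    """Proportional dose scaling, valid only under linear pharmacokinetics."""
    return ref_dose_mg * target_mg_l / pred_at_ref_mg_l


# Hypothetical population-predicted troughs (mg/L) at the reference dose.
preds = {"model_1": 30.0, "model_2": 40.0, "model_3": 50.0, "model_4": 80.0}

uniform = {m: 1.0 for m in preds}   # uninformed ensembling: same weight for all models
informed = {"model_1": 0.1, "model_2": 0.2, "model_3": 0.3, "model_4": 0.4}

dose_uniform = dose_for_target(REF_DOSE_MG, ensemble_prediction(preds, uniform), TARGET_MG_L)
dose_informed = dose_for_target(REF_DOSE_MG, ensemble_prediction(preds, informed), TARGET_MG_L)

print(round(dose_uniform), round(dose_informed))
```

The single-model approach corresponds to putting all the weight on one model for every subject.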
These approaches were externally validated (without retraining) using clinical data of 74 ICU patients and 121 observations collected at the University Hospital of Poitiers and at Avicenne Hospital in Bobigny.
A sensitivity analysis was performed to assess the impact of different model training configurations with regard to model inclusion, dosing scheme, and the number of subjects.
Results – Discussion
The benefit of MIPD was demonstrated over standard dose by ensembling methods (FAMD and RT-inf ensembling). The proportion of simulated patients achieving target concentrations increased from 16 % with the standard dose to 36-37 % with ensembling. In clinical data, target attainment improved from 29 % to 40-49 %.
Ensembling and ML methods outperformed the single model approach in simulated data. The newly developed methods (RT-inf ensembling and FAMD) not only increase target attainment, but by consistently outperforming uninformed ensembling, they also eliminate the need for model selection.
A discrepancy was observed between the performance on simulated and clinical datasets, raising the question of extrapolability. These differences likely stemmed from the covariates in the PopPK models not accounting for the variability in the clinical data and from the inability of some models to predict doses for specific patient subgroups.
Training and developing the algorithms on simulated data was intended to address the issue of generalizability. However, a key limitation of this simulation-based approach is that it relies on generating ground truth concentrations using PopPK models that may themselves be misspecified. This makes it difficult for ensembling algorithms to effectively eliminate inadequate models. For example, in the sensitivity analysis, ML models trained on intermittent-infusion data could not make accurate predictions for continuous infusion, and vice versa. In contrast, PopPK ensembling methods showed greater flexibility, as the assumption of linearity in their structural models remained valid across different dosing schemes. Structural and identifiability problems were detected on simulated data: one of the models, Rambaud et al. [5], developed exclusively on continuous infusion, exhibited parameter identifiability problems and consistently underpredicted concentrations for intermittent infusion. These issues can be resolved by simulating with each model only for the dosing schemes on which it was developed and by using the dosing regimen as a predictor.
The number of models included in training and performance evaluation is critical, as fewer models increase the risk of performance being assessed primarily on a model’s own simulated data. Ensembling methods that incorporate model performance can appropriately evaluate the informational value of models applied outside their development cohort. FAMD, in contrast, relies solely on the similarity between a patient’s covariates and the development cohorts. This makes the approach inherently dependent on the underlying model performance and unable to identify models with good extrapolability or poor self-predictive capacity. However, the method remains unaffected by potentially biased performance estimates derived from the training data. FAMD demonstrated the highest target attainment (49 %) on the clinical validation data when applied to continuous infusion.
Conclusion
ML-based PopPK model ensembling, used as a complementary tool to a posteriori MIPD until the first blood sample is available, increases target attainment, enhances the robustness of predictions, and overcomes the challenge of model selection. These methods are available as an R package (openMIPD – https://github.com/INSERM-U1070-PHAR2/openMIPD), making them ready to be applied to other molecules and clinical scenarios.

References:
1. Guilhaumou et al. (2019). Optimization of beta-lactam antibiotics in critically ill patients. Crit Care 23(1):104.
2. Carlier et al. (2013). Population pharmacokinetics and dosing simulations of amoxicillin/clavulanic acid in critically ill patients. J Antimicrob Chemother 68(11):2600–2608.
3. Fournier et al. (2018). Population pharmacokinetic study of amoxicillin in burn patients. Antimicrob Agents Chemother 62:e00505-18.
4. Mellon et al. (2020). Population pharmacokinetics and dosing simulations of amoxicillin in obese adults receiving co-amoxiclav. J Antimicrob Chemother 75(12):3611–3618.
5. Rambaud et al. (2020). Development and validation of a dosing nomogram for amoxicillin in infective endocarditis. J Antimicrob Chemother 75(10):2941–2950.
6. Johnson et al. (2023). MIMIC-IV, a freely accessible electronic health record dataset. Sci Data 10:1.
7. Agema et al. (2024). Selecting the best pharmacokinetic models for model-informed precision dosing with model ensembling. Clin Pharmacokinet 63(10):1449–1461.
8. Pagès (2004). Analyse factorielle de données mixtes. Rev Stat Appliquée 52(4):93–111.
9. Mahalanobis (2018). Reprint of: Mahalanobis, P.C. (1936) On the generalised distance in statistics. Sankhya A 80(S1):1–7.

Reference: PAGE 34 (2026) Abstr 12048 [www.page-meeting.org/?abstract=12048]

Poster: Methodology - New Modelling Approaches