Marian Klose 1,2, Dominik Marschner 3, Markus Knott 3,4, Julia Wendler 3,4, Julian Müller-Kühnle 3, Christin Nyhoegen 1, Niklas Hartung 5, Wilhelm Huisinga 2,5, Robin Michelet 1, Anna M. Mc Laughlin 1,6, Gerald Illerhaus 3,4, Charlotte Kloft 1,2
1 Freie Universität Berlin, Institute of Pharmacy, Department of Clinical Pharmacy and Biochemistry (Berlin, Germany), 2 Graduate Research Training Program PharMetrX (Berlin/Potsdam, Germany), 3 Department of Haematology/Oncology and Palliative Care, Klinikum Stuttgart (Stuttgart, Germany), 4 Stuttgart Cancer Center – Tumorzentrum Eva Mayr-Stihl, Klinikum Stuttgart (Stuttgart, Germany), 5 Institute of Mathematics, University of Potsdam (Potsdam, Germany), 6 Pharmetheus AB (Uppsala, Sweden)
INTRODUCTION
Nonlinear mixed-effects (NLME) models with Bayesian forecasting are widely used for model-informed clinical decision-making, including guidance of post-infusion care after high-dose methotrexate (HD-MTX) [1]. Delayed HD-MTX elimination (>72 h to drop below 0.2 µM) increases the risk of severe toxicities (e.g., acute kidney injury), making early identification of at-risk patients crucial for timely rescue interventions. Our group previously benchmarked an NLME-based full Bayesian forecasting approach (“Bayes-NLME”) at different clinical decision time points and found that predictive performance depended strongly on the availability and quality of individual patient data [2]. Poor performance at early time points (pre-dose, 4 h post-dose) motivates the evaluation of alternative quantitative approaches. Machine learning (ML) is increasingly used in model-informed precision dosing (MIPD) [3], and hybrid methods combining ML with NLME have recently been proposed [4,5]. However, despite >30 published HD-MTX PK NLME models [6], few studies have assessed PK-related predictions using ML [7,8,9], and it remains unclear whether direct or hybrid ML approaches can outperform established methods for identifying delayed MTX elimination.
OBJECTIVES
This work aimed to (i) evaluate direct and hybrid ML workflows for identifying delayed elimination by predicting the time for the MTX concentration to drop below 0.2 µM (t0.2µM), and (ii) compare their performance with the established Bayes-NLME reference approach.
METHODS
Clinical routine data from adult CNS lymphoma patients receiving i.v. HD-MTX, previously used for development of the reference Bayes-NLME model [2], were reused with an identical ID-based 75/25 train-test split for the present ML analysis. Candidate predictors for t0.2µM were selected during feature engineering for each decision time point (0, 4, 24, 48 h) per patient and cycle. Using tidymodels in R, separate workflows were built per time point in two modes: direct prediction of t0.2µM and hybrid prediction of the residual between observed t0.2µM and the Bayes-NLME median prediction. Candidate regression models included gradient-boosted decision trees (XGBoost and LightGBM), random forest, Cubist rule-based regression, elastic net regression, multivariate adaptive regression splines, support vector regression, k-nearest neighbors regression, and a feed-forward neural network. Algorithm-specific preprocessing (e.g., near-zero variance filtering, missing-data handling, collinearity control) was embedded in each workflow. Hyperparameters were tuned via 5-times repeated 10-fold cross-validation using ANOVA racing and space-filling grids (n=100), selecting the best model by root mean squared error (RMSE). Predictive performance on the cross-validated training data was compared against Bayes-NLME using continuous performance metrics, including RMSE and mean relative error (MRE), and categorical performance metrics, including true delayed rate (TDR), delayed predictive value (DPV), and normal predictive value (NPV). Feature importance was evaluated using SHAP (SHapley Additive exPlanations) values.
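The hybrid target and the performance metrics named above can be sketched in a few lines. This is a minimal Python illustration, not the tidymodels code used in the analysis. The abstract does not state the metric formulas, so the following are assumptions: the residual is taken as observed minus Bayes-NLME median, MRE is the mean of (predicted - observed)/observed in percent, and TDR, DPV and NPV are computed like sensitivity, positive predictive value and negative predictive value, with "delayed" meaning t0.2µM > 72 h. All function names are hypothetical.

```python
import math

# Delayed elimination: more than 72 h to drop below 0.2 µM.
DELAY_THRESHOLD_H = 72.0


def hybrid_target(observed, nlme_median):
    """Residual learned in hybrid mode (assumed sign: observed - NLME median)."""
    return [o - m for o, m in zip(observed, nlme_median)]


def hybrid_prediction(nlme_median, predicted_residual):
    """Final hybrid prediction: NLME median plus the ML-predicted residual."""
    return [m + r for m, r in zip(nlme_median, predicted_residual)]


def rmse(observed, predicted):
    """Root mean squared error, in hours."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mre_percent(observed, predicted):
    """Mean relative error in percent (assumed definition); > 0 means overprediction."""
    relative = [(p - o) / o for o, p in zip(observed, predicted)]
    return 100.0 * sum(relative) / len(relative)


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a class is absent.
    return numerator / denominator if denominator else float("nan")


def classification_metrics(observed, predicted, threshold=DELAY_THRESHOLD_H):
    """TDR, DPV and NPV after dichotomising t0.2µM at the delay threshold."""
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        if o > threshold and p > threshold:
            tp += 1  # delayed, predicted delayed
        elif o > threshold:
            fn += 1  # delayed, predicted normal
        elif p > threshold:
            fp += 1  # normal, predicted delayed
        else:
            tn += 1  # normal, predicted normal
    return {
        "TDR": _ratio(tp, tp + fn),
        "DPV": _ratio(tp, tp + fp),
        "NPV": _ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Invented values for illustration only; not study data.
    observed = [60.0, 80.0, 100.0, 50.0]
    nlme_median = [55.0, 70.0, 90.0, 75.0]
    print(hybrid_target(observed, nlme_median))
    print(rmse(observed, nlme_median), mre_percent(observed, nlme_median))
    print(classification_metrics(observed, nlme_median))
```

In hybrid mode the ML model only has to learn what the Bayes-NLME median misses, so a residual prediction of zero reproduces the reference prediction.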
RESULTS
Across the eight combinations of decision time point and prediction mode, boosted tree models (XGBoost/LightGBM) were selected five times (62.5%), elastic net regression (glmnet) twice (25%), and random forest (ranger) once (12.5%). Overall, ML and Bayes-NLME showed comparable performance: ML (direct and hybrid) achieved slightly lower RMSE at all four decision time points, but the differences were negligible (≤2.33 h). ML-hybrid achieved the lowest RMSE at 3/4 decision time points, compared with 1/4 for ML-direct. The limited RMSE decrease was accompanied by a comparable tendency to overpredict t0.2µM in both ML-direct (MRE: 0 h: 12.4%; 4 h: 12.7%; 24 h: 8.84%; 48 h: 1.80%) and ML-hybrid (MRE: 0 h: 12.9%; 4 h: 14.0%; 24 h: 7.45%; 48 h: 1.00%), whereas Bayes-NLME was largely unbiased (MRE: 0 h: 1.4%; 4 h: 1.9%; 24 h: -1.0%; 48 h: 1.6%). This overprediction tendency was also reflected in classification: while ML (TDR: 0 h: 65%; 4 h: 67%; 24 h: 69%; 48 h: 83%) identified, on average, a higher proportion of delayed eliminators across time points than Bayes-NLME (TDR: 0 h: 39%; 4 h: 43%; 24 h: 59%; 48 h: 81%), its delayed predictions were also less reliable on average (DPV: ML 48/47/57/75% vs Bayes-NLME 59/58/70/79% at 0/4/24/48 h). Normal predictions were on average 3.2 percentage points more reliable with ML than with Bayes-NLME, with larger gains early (NPV +5.2 percentage points at 0 h) and minimal difference later (NPV +1.08 percentage points at 48 h). Across models, the most influential SHAP features were pre-dose haemoglobin, minimum MTX concentration within the cycle, minimum eGFR in previous cycles, pre-dose C-reactive protein, and MTX concentrations at 24-48 h.
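Why a positive MRE goes along with higher TDR and NPV but lower DPV can be shown with a toy example. The Python sketch below uses invented values (not study data) and assumes, for simplicity, a uniform 10% inflation of otherwise near-unbiased predictions. It also assumes TDR, DPV and NPV are computed like sensitivity, positive predictive value and negative predictive value at the 72 h threshold. The names `rates` and `inflate` are hypothetical.

```python
THRESHOLD_H = 72.0  # delayed elimination: t0.2µM > 72 h

# Invented observed and near-unbiased predicted t0.2µM values (hours).
OBSERVED = [50.0, 60.0, 68.0, 70.0, 75.0, 80.0, 90.0, 110.0]
UNBIASED = [52.0, 58.0, 66.0, 69.0, 70.0, 71.0, 88.0, 105.0]


def inflate(predicted, factor=1.10):
    """Mimic systematic overprediction by scaling every prediction."""
    return [factor * p for p in predicted]


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a class is absent.
    return numerator / denominator if denominator else float("nan")


def rates(observed, predicted, threshold=THRESHOLD_H):
    """TDR, DPV and NPV (assumed definitions) at the delay threshold."""
    flags = [(o > threshold, p > threshold) for o, p in zip(observed, predicted)]
    tp = sum(1 for o, p in flags if o and p)
    fn = sum(1 for o, p in flags if o and not p)
    fp = sum(1 for o, p in flags if not o and p)
    tn = sum(1 for o, p in flags if not o and not p)
    return {
        "TDR": _ratio(tp, tp + fn),
        "DPV": _ratio(tp, tp + fp),
        "NPV": _ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    print("near-unbiased:", rates(OBSERVED, UNBIASED))
    print("inflated 10% :", rates(OBSERVED, inflate(UNBIASED)))
```

In this toy set, inflation pushes borderline predictions above 72 h. More truly delayed cycles are flagged and fewer are missed, so TDR and NPV rise, but some normal cycles are also flagged as delayed, so DPV falls.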
CONCLUSION
Overall, ML achieved point-estimate performance comparable to Bayes-NLME while being computationally cheaper, but tended to predict higher-than-observed t0.2µM values. In contrast, Bayes-NLME was largely unbiased, quantified predictive uncertainty, and is expected to be more robust when extrapolated to other clinical scenarios.
REFERENCES
[1] Z.L. Taylor, T. Mizuno, N.C. Punt et al. MTXPK.org: A Clinical Decision Support Tool Evaluating High-Dose Methotrexate Pharmacokinetics to Inform Post-Infusion Care and Use of Glucarpidase. Clinical Pharmacology & Therapeutics 108: 635–643 (2020).
[2] M. Klose, D. Marschner, M. Knott et al. A Bayesian-NLME approach identifies patients at risk of delayed MTX elimination if informative TDM data is provided. Abstracts of the 33rd Annual Meeting of the Population Approach Group Europe (PAGE) (2025).
[3] I.K. Minichmayr, E. Dreesen, M. Centanni et al. Model-informed precision dosing: State of the art and future perspectives. Advanced Drug Delivery Reviews 215: 115421 (2024).
[4] J.-B. Woillard, M. Labriffe, A. Prémaud et al. Estimation of drug exposure by machine learning based on simulations from published pharmacokinetic models: The example of tacrolimus. Pharmacological Research 167: 105578 (2021).
[5] A. Destere, P. Marquet, C.S. Gandonnière et al. A Hybrid Model Associating Population Pharmacokinetics with Machine Learning: A Case Study with Iohexol Clearance Estimation. Clinical Pharmacokinetics 61: 1157–1165 (2022).
[6] Y. Zhang, L. Sun, X. Chen et al. A Systematic Review of Population Pharmacokinetic Models of Methotrexate. European Journal of Drug Metabolism and Pharmacokinetics 47: 143–164 (2022).
[7] C. Jian, S. Chen, Z. Wang et al. Predicting delayed methotrexate elimination in pediatric acute lymphoblastic leukemia patients: an innovative web-based machine learning tool developed through a multicenter, retrospective analysis. BMC Medical Informatics and Decision Making 23: 148 (2023).
[8] C. Zhou, Y. Qian, Y. Xue et al. Risk factor identification for delayed excretion in pediatric high-dose methotrexate therapy: a machine learning analysis of real-world data. Frontiers in Pharmacology 16: (2025).
[9] M. Zhan, Z. Chen, C. Ding et al. Risk prediction for delayed clearance of high-dose methotrexate in pediatric hematological malignancies by machine learning. International Journal of Hematology 114: 483–493 (2021).
Reference: PAGE 34 (2026) Abstr 11877 [www.page-meeting.org/?abstract=11877]
Poster: Methodology – AI/Machine Learning