It’s a match! Machine learning & compartmental modeling for early PK predictions

Felix Jost1, Clemens Giegerich2, Christoph Grebner3, Hans Matter3, Henrik Cordes1

1Sanofi, Translational Medicine Unit, Research Pharmacometrics, 2Sanofi, Translational Medicine Unit, Disease Modeling, 3Sanofi, Integrated Drug Discovery

Introduction: Recent advances in big data, AI/ML, and affordable computing power promise to streamline drug development [1]. Early-stage discovery can benefit from improved prediction of chemical properties and ADME behavior before synthesis [2,3]. Reliable predictions of in vivo outcomes could significantly reduce animal usage, costs, and time in drug screening. Several machine learning approaches to predict in vivo pharmacokinetic (PK) profiles have recently been published [4–15]. They differ in methodological details and in the underlying data sets, which vary in size, quality, and investigated species, as well as in the sampling/splitting strategy used for training, validation, and test sets and in the quality or performance metrics reported. As a result, an objective comparison or benchmarking across these approaches is challenging or even impossible. We therefore provide a systematic comparison of multiple ML methods based on a consistent data set, splitting strategy, and set of performance measures, enabling direct benchmarking of predictions of in vivo rat plasma PK profiles for small molecules.

Methods: We analyzed data from 739 study arms across 721 in vivo rat PK studies after intravenous administration, encompassing 696 different molecules and 14,155 plasma PK data points. We compared three model-based hybrid methods and one pure ML method against a reference hybrid approach that uses NCA parameters with a one-compartment PK model (NCA-ML). The second method uses ML-predicted in vitro characteristics as input for a PBPK model that predicts the in vivo PK profiles (PBPK-ML). Ten different PBPK-ML models were set up, differing in their underlying partitioning and permeability calculation models. The two other hybrid methods lie between the NCA-based and the mechanistic PBPK-based approaches. In both, a neural network (NN) is trained to predict parameters of a compartmental PK model, which then generates concentration-time profiles.
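To make the step "predicted parameters generate a concentration-time profile" concrete, the following is a minimal sketch assuming a one-compartment model with intravenous bolus dosing. The abstract does not state the compartmental structure used by CMT-ML or CMT-PINN, so the model form, the function name `one_compartment_iv_bolus`, and all parameter values are illustrative assumptions; the clearance and volume stand in for values a trained model would predict.

```python
import math


def one_compartment_iv_bolus(dose, clearance, volume, times):
    """Concentrations after an IV bolus for a one-compartment model.

    C(t) = (dose / volume) * exp(-(clearance / volume) * t)
    Units are assumed consistent (e.g. mg, L/h, L, h).
    """
    k_el = clearance / volume   # first-order elimination rate constant
    c0 = dose / volume          # concentration at time zero
    return [c0 * math.exp(-k_el * t) for t in times]


# Hypothetical values standing in for ML-predicted clearance and volume.
times = [0.0, 1.0, 2.0, 4.0]
profile = one_compartment_iv_bolus(dose=1.0, clearance=0.5, volume=2.0, times=times)
```

In a hybrid setup of this kind, only `clearance` and `volume` would come from the learned model; the shape of the profile is fixed by the PK equation.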
The two approaches differ in their training methodology (CMT-ML and CMT-PINN). In CMT-ML, the NN is trained on PK parameters estimated from compartmental models, whereas in CMT-PINN the NN is trained directly on the concentration-time profiles. In addition to the hybrid models, the last approach is a pure ML approach (PURE-ML), which is also trained on PK profiles but without using a compartmental PK model. Performance evaluation included general metrics (R², MAPE, MDAPE, Spearman correlation), predicted PK properties (AUC, C0, Cmin), and the proportion of predicted mean concentration data points within two-fold and three-fold of the observed values. Additionally, the geometric mean fold error (GMFE) was calculated between estimated and predicted concentration-time profiles. This observation-independent metric facilitates cross-study comparisons with previously published work.

Results: Values in parentheses are listed in the order R² (1), MAPE (2), MDAPE (3), Spearman correlation (4), AUC (5), C0 (6), and Cmin (7). Comparing the hybrid methods NCA-ML (0.44, 97, 14, 0.87, 1.5, 3.9, 5.7), PBPK-ML (0.28-0.64, 51-79, 21-34, 0.53-0.84, 1.0-1.9, 1.3-5.4, 0.2-1.2), CMT-ML (0.47, 90, 16, 0.85, 2.5, 1.8, 45), and CMT-PINN (0.85, 25, 9, 0.93, 1.1, 1.1, 0.6) across all metrics, the CMT-PINN approach performed best, showing the highest R² (1) on log-scaled PK data, the lowest MAPE (2) and MDAPE (3), the highest Spearman correlation (4), and relative values closest to 1 for AUC (5), C0 (6), and Cmin (7). The PURE-ML approach (0.79, 22, 10, 0.90, 1.1, 1.6, 0.7) shows results comparable to CMT-PINN on the general and PK metrics. For the percentage of predicted mean concentration data points in the test set within two-fold and three-fold of the observed values, CMT-PINN (65.9% and 83.5%) achieves the highest values, about 4 to 5 percentage points above PURE-ML (61.0% and 79.7%). NCA-ML, CMT-ML, and PBPK-ML have values below 50%.
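The error metrics named above can be sketched as follows. The abstract does not give its exact formulas, so these are the commonly used definitions and should be read as assumptions: fold error as max(pred/obs, obs/pred), MAPE and MDAPE as mean and median absolute percentage error, and GMFE as 10 raised to the mean absolute log10 ratio. All function names and the example numbers are hypothetical.

```python
import math
from statistics import mean, median


def fold_errors(pred, ref):
    """Symmetric fold error per point: always >= 1."""
    return [max(p / r, r / p) for p, r in zip(pred, ref)]


def fraction_within_fold(pred, ref, fold):
    """Share of points whose fold error does not exceed `fold`."""
    errors = fold_errors(pred, ref)
    return sum(e <= fold for e in errors) / len(errors)


def mape(pred, ref):
    """Mean absolute percentage error (in percent)."""
    return 100 * mean(abs(p - r) / r for p, r in zip(pred, ref))


def mdape(pred, ref):
    """Median absolute percentage error (in percent)."""
    return 100 * median(abs(p - r) / r for p, r in zip(pred, ref))


def gmfe(pred, ref):
    """Geometric mean fold error (assumed definition)."""
    return 10 ** mean(abs(math.log10(p / r)) for p, r in zip(pred, ref))


# Made-up concentrations, for illustration only.
observed = [10.0, 5.0, 2.0, 1.0]
predicted = [12.0, 4.0, 5.0, 0.9]
within_2 = fraction_within_fold(predicted, observed, 2.0)
within_3 = fraction_within_fold(predicted, observed, 3.0)
```

Because the fold error is symmetric in over- and under-prediction, a predicted value of half or double the reference counts the same, which is why it is usually preferred over percentage error for concentrations spanning several orders of magnitude.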
The median GMFE confirms that CMT-PINN (2.1) outperforms NCA-ML (8.2), CMT-ML (7.4), and PBPK-ML (3.5-7.8), whereas CMT-PINN and PURE-ML (2.09) share the same median GMFE.

Conclusion: We benchmarked five ML methods on identical data sets and processing methodology for small-molecule plasma PK prediction. CMT-PINN and PURE-ML demonstrated the highest prediction accuracy. Hybrid methods improved when trained directly on concentration-time data rather than on pre-calculated PK parameters. While pharmacometricians favor interpretable hybrid approaches for their physiological basis, integrated compartmental-ML methods now significantly impact drug discovery at scale. These tools, accessible to all of R&D via Shiny applications, enable scientists to simulate PK and PK/PD studies within minutes using qualified AI/ML predictions, facilitating data-driven decision making across research projects.
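The difference between training on pre-calculated PK parameters and training directly on concentration-time data can be illustrated by the two training targets. This is a sketch under stated assumptions, not the authors' implementation: the abstract gives neither the loss functions nor the compartmental structure, so the one-compartment IV bolus model, the log-scale mean squared error, and the names `parameter_loss` and `profile_loss` are all hypothetical, and the neural network itself is omitted.

```python
import math


def simulate(cl, v, dose, times):
    """One-compartment IV bolus profile (assumed model form)."""
    return [dose / v * math.exp(-cl / v * t) for t in times]


def parameter_loss(pred_params, fitted_params):
    """Parameter-space target (CMT-ML-like): match previously
    estimated PK parameters on the log scale."""
    sq = [(math.log(p) - math.log(f)) ** 2
          for p, f in zip(pred_params, fitted_params)]
    return sum(sq) / len(sq)


def profile_loss(pred_params, dose, times, observed):
    """Profile-space target (CMT-PINN-like): push predicted parameters
    through the PK model and compare with observed concentrations."""
    cl, v = pred_params
    simulated = simulate(cl, v, dose, times)
    sq = [(math.log(s) - math.log(o)) ** 2
          for s, o in zip(simulated, observed)]
    return sum(sq) / len(sq)


# Hypothetical data: observations generated from CL = 0.5, V = 2.0.
times = [0.5, 1.0, 2.0, 4.0]
observed = simulate(0.5, 2.0, dose=1.0, times=times)
loss_exact = profile_loss((0.5, 2.0), 1.0, times, observed)
loss_off = profile_loss((1.0, 2.0), 1.0, times, observed)
```

With the profile-space target, errors in the intermediate parameter estimates do not enter the training signal; only the mismatch to the measured concentrations does.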

1. Alowais, S. A. et al. Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Med. Educ. 23, 689 (2023).
2. Rehman, A. U. et al. Role of Artificial Intelligence in Revolutionizing Drug Discovery. Fundam. Res. (2024). doi:10.1016/j.fmre.2024.04.021
3. Blanco-González, A. et al. The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies. Pharmaceuticals 16, 891 (2023).
4. Walter, M. et al. In silico PK predictions in Drug Discovery: Benchmarking of Strategies to Integrate Machine Learning with Empiric and Mechanistic PK modelling. bioRxiv 2024.07.30.605777 (2024). doi:10.1101/2024.07.30.605777
5. Beckers, M., Yonchev, D., Desrayaud, S., Gerebtzoff, G. & Rodríguez-Pérez, R. DeepCt: Predicting pharmacokinetic concentration-time curves and compartmental models from chemical structure using deep learning. (2024). doi:10.26434/chemrxiv-2024-vg9h7
6. Pillai, N., Abos, A., Teutonico, D. & Mavroudis, P. D. Machine learning framework to predict pharmacokinetic profile of small molecule drugs based on chemical structure. Clin. Transl. Sci. 17, e13824 (2024).
7. Mavroudis, P. D., Teutonico, D., Abos, A. & Pillai, N. Application of machine learning in combination with mechanistic modeling to predict plasma exposure of small molecules. Front. Syst. Biol. 3, 1180948 (2023).
8. Führer, F. et al. A deep neural network: mechanistic hybrid model to predict pharmacokinetics in rat. J. Comput.-Aided Mol. Des. 38, 7 (2024).
9. Handa, K. et al. Prediction of Compound Plasma Concentration–Time Profiles in Mice Using Random Forest. Mol. Pharm. 20, 3060–3072 (2023).
10. Gruber, A. et al. Prediction of Human Pharmacokinetics From Chemical Structure: Combining Mechanistic Modeling with Machine Learning. J. Pharm. Sci. 113, 55–63 (2024).
11. Schneckener, S. et al. Prediction of Oral Bioavailability in Rats: Transferring Insights from in Vitro Correlations to (Deep) Machine Learning Models Using in Silico Model Outputs and Chemical Structure Parameters. J. Chem. Inf. Model. 59, 4893–4905 (2019).
12. Naga, D., Parrott, N., Ecker, G. F. & Olivares-Morales, A. Evaluation of the Success of High-Throughput Physiologically Based Pharmacokinetic (HT-PBPK) Modeling Predictions to Inform Early Drug Discovery. Mol. Pharm. 19, 2203–2216 (2022).
13. Stoyanova, R. et al. Computational Predictions of Nonclinical Pharmacokinetics at the Drug Design Stage. J. Chem. Inf. Model. 63, 442–458 (2023).
14. Andrews-Morger, A., Reutlinger, M., Parrott, N. & Olivares-Morales, A. A Machine Learning Framework to Improve Rat Clearance Predictions and Inform Physiologically Based Pharmacokinetic Modeling. Mol. Pharm. 20, 5052–5065 (2023).
15. Obrezanova, O. et al. Prediction of In Vivo Pharmacokinetic Parameters and Time–Exposure Curves in Rats Using Machine Learning from the Chemical Structure. Mol. Pharm. 19, 1488–1504 (2022).

Reference: PAGE 33 (2025) Abstr 11486 [www.page-meeting.org/?abstract=11486]

Poster: Methodology – AI/Machine Learning
