Lina Keutzer (1), Huifang You (1), Ulrika SH Simonsson (1)
(1) Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
Objectives: Pharmacometrics (PMX) plays a vital role in areas ranging from model-informed drug development and regulatory decision-making to model-informed precision dosing. In pharmacokinetic-pharmacodynamic (PKPD) analysis, pharmacokinetic (PK) data, either continuous drug concentrations or secondary PK parameters, are commonly used as input to drive the exposure-response relationship. This usually requires that a population PK model is available or is developed first, which can be time- and labor-intensive. Machine learning (ML) stands out for its high computational efficiency and is increasingly used in drug development (1). Thus, ML could support PMX with faster analysis by using PK predicted by an ML algorithm as input for the PKPD model, provided that the predictive performance is acceptable. In this work, we explore the ability of different ML algorithms to select predictors (= features) correctly and to predict plasma concentration over time accurately and precisely, using rifampicin as an example.
Methods: Rifampicin plasma concentrations over time for 83 virtual tuberculosis patients were simulated from a previously published population PK model (2) with sample size and covariate distribution in accordance with the original study design (3) using NONMEM (4). The true predictors included in the simulations were time after dose (TAD), sampling week and dose, which are part of the structural PMX model, as well as the covariate fat-free mass (FFM). PK sampling was simulated to occur on days 7 and 14 at 0, 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12 and 24 h post-dose. The dataset, consisting of 1826 observations, was split randomly into a training dataset (80%) used to build the ML model and a test dataset (20%) used to evaluate the model’s predictive performance. Using the training dataset as input, different ML algorithms were trained with cross-validation to select predictors and to predict plasma concentration over time. Feature selection was performed with the ML algorithms Boruta (5), XGBoost (6), GBM (7) and Random Forest (8). Evaluated predictors were TAD, sampling week, dose, bodyweight (WT), FFM, gender, age, height, race and HIV co-infection. For prediction of plasma concentration over time, the ML algorithms GBM, XGBoost, Random Forest and an ensemble model (9) combining all three were evaluated. The algorithms were optimized by testing different model parameters (hyperparameter tuning). Model performance was evaluated using the R2 between observations and predictions, the relative root mean square error (rRMSE) as well as graphical evaluation. Model building was performed in R version 3.6.3 (10).
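The data split and the two performance metrics described above can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the authors' R code. Several points are assumptions, because the abstract does not define them: the split is done per observation rather than per patient, R2 is taken as the squared Pearson correlation between observations and predictions, and rRMSE is taken as the root mean square of the relative prediction errors, expressed in percent. The function names are hypothetical.

```python
import math
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Randomly split rows into a training and a test list.

    Assumption: the split is per observation, not per patient.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov ** 2 / (var_obs * var_pred)


def relative_rmse(observed, predicted):
    """rRMSE in percent (assumed definition):
    100 * sqrt(mean(((predicted - observed) / observed) ** 2)).

    Observations equal to zero must be excluded beforehand,
    since the relative error is undefined for them.
    """
    rel_sq = [((p - o) / o) ** 2 for o, p in zip(observed, predicted)]
    return 100.0 * math.sqrt(sum(rel_sq) / len(rel_sq))


# Usage: 1826 placeholder observation indices, split 80/20.
rows = list(range(1826))
train, test = train_test_split(rows)
print(len(train), len(test))  # 1461 365
```

Splitting per observation means that samples from the same virtual patient can end up in both the training and the test set. A per-patient split would be the stricter alternative if the goal were to judge predictions for unseen individuals.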
Results: All four algorithms correctly selected TAD and dose as the most important predictors. Random Forest and Boruta correctly identified FFM (the true covariate) as the third most important predictor, whereas GBM assigned WT slightly higher importance than FFM, which could be due to the high correlation between the two. XGBoost incorrectly selected sex as the third most important predictor, followed by WT and FFM.
With regard to prediction of plasma concentration over time, XGBoost performed best, with the highest R2 (0.70) and lowest imprecision (rRMSE: 63.7%). The ensemble model combining all three algorithms and GBM also performed well, each with an R2 of 0.69 and an rRMSE of 65.2%. Random Forest had the lowest R2 (0.66) and highest imprecision (rRMSE: 68.7%). Graphical evaluation revealed good predictive performance across all tested algorithms.
The results demonstrate that Boruta and Random Forest are capable of correctly selecting predictive features. XGBoost showed the best predictive performance with regard to plasma concentration over time. However, all four prediction models showed acceptable predictive performance for plasma concentrations, considering the small dataset.
Conclusions: This work indicates that ML can be a useful tool for fast covariate selection and subsequent prediction of PK to be used as input into a PKPD model. Bridging PMX and ML seems very promising: ML can add value to PMX workflows through increased computational efficiency, whereas PMX methods can improve the interpretability of ML approaches and provide hypothesis testing, which is crucial for regulatory interactions. To utilize the great potential of both methods, we propose that PMX and ML join forces to improve computational efficiency, model performance and predictivity.
References:
[1] Réda, C., Kaufmann, E. & Delahaye-Duriez, A. Machine learning applications in drug development. Computational and Structural Biotechnology Journal 18, 241–252 (2020).
[2] Svensson, R. J. et al. A Population Pharmacokinetic Model Incorporating Saturable Pharmacokinetics and Autoinduction for High Rifampicin Doses. Clin Pharmacol Ther 103, 674–683 (2018).
[3] Boeree, M. J. et al. A dose-ranging trial to optimize the dose of rifampin in the treatment of tuberculosis. Am. J. Respir. Crit. Care Med. 191, 1058–1065 (2015).
[4] Beal, S. L. et al. NONMEM Users Guides. (Icon Development Solutions, Ellicott City, Maryland, USA, 1989–2011).
[5] Kursa, M. B. & Rudnicki, W. R. Boruta: Wrapper Algorithm for All Relevant Feature Selection [R package Boruta version 7.0.0]. (Comprehensive R Archive Network (CRAN), 2020).
[6] Chen, T. xgboost: Extreme Gradient Boosting [R package xgboost version 1.4.1.1]. (Comprehensive R Archive Network (CRAN), 2021).
[7] Greenwell, B. gbm: Generalized Boosted Regression Models [R package gbm version 2.1.8]. (Comprehensive R Archive Network (CRAN), 2020).
[8] Breiman, L. randomForest: Breiman and Cutler’s Random Forests for Classification and Regression. (Comprehensive R Archive Network (CRAN), 2018).
[9] Deane-Mayer, Z. A. & Knowles, J. E. caretEnsemble: Ensembles of Caret Models [R package caretEnsemble version 2.0.1]. (Comprehensive R Archive Network (CRAN), 2019).
[10] R Core Team. R: A language and environment for statistical computing. (R Foundation for Statistical Computing, 2015).
Reference: PAGE 29 (2021) Abstr 9808 [www.page-meeting.org/?abstract=9808]
Poster: Methodology - Other topics