2022 - Ljubljana - Slovenia

PAGE 2022: Methodology – AI/Machine Learning
Mark Sale

Comparison of Robustness and Efficiency of Four Machine Learning Algorithms for Identification of Optimal Population Pharmacokinetic Models

Mark Sale M.D. (1), Mohamed Ismail Pharm. D., M.S. (2, 4), Fenggong Wang Ph.D. (3), Kairui Feng Ph.D. (3), Meng Hu Ph.D. (3), Liang Zhao Ph.D., MBA (3), Robert Bies Ph.D. (4)

(1) Certara, (2) Enhanced Pharmacodynamics LLC, (3) FDA (4) State University of New York at Buffalo

Objectives: To compare the performance, robustness and efficiency of 4 common machine learning (ML) algorithms: Bayesian Optimization-Gaussian Process (GP), Genetic Algorithm (GA), Random Forest (RF) and Gradient Boosted Random Tree (GBRT), with exhaustive search as the reference.

Methods: We have previously described the application of a ML algorithm (GA) to identify optimal models (1). This work expands that to include a comparison of other algorithms.

A simulated data set was constructed. The simulation model was:

  • Linear 2-compartment with first-order absorption (ADVAN4); Typical Value (TV) of Clearance (CL) = 200 L/hr, TV of Central Volume (Vc) = 1000 L, Ka = 2/hr, each with a log-normal between-subject variance of 0.2; K23 and K32 = 0.2/hr
  • An absorption lag time with a TV of 0.2 hours (log-normal BSV variance of 0.2)
  • True covariates: CL ~ Weight, bilirubin, race and ALT; Vc ~ Weight; Ka ~ age

Samples were recorded for 50 subjects at 9 time points, for a total of 450 observations.

  • The search space consisted of 12,960 models. Dimensions of candidate features in the search space included:
    • Number of compartments (1,2,3)
    • Effect of Weight and Sex on Vc (present or absent)
    • Effect of Weight and Age on CL (present or absent, search did not include bilirubin, race or ALT)
    • BSV on CL, Vc and Ka (present or absent)
    • Zero order absorption (present or absent)
    • First order absorption (present or absent)
    • Residual Error models

Models were encoded as an integer array (for GP, RF and GBRT) or a bit array (GA). For the GA, the bit array was converted to an integer array. Each integer identified exactly one “feature” from the corresponding dimension of the search space; e.g., dimension 1 was the number of compartments, with values of 1, 2 or 3. A common framework and application was developed to run each of these algorithms.
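The integer/bit encoding can be sketched in Python as follows; the dimension names, option lists and bit widths below are illustrative assumptions, not the authors' implementation:

```python
import math

# Illustrative subset of the search-space dimensions; a model is one
# integer index per dimension, selecting one candidate "feature".
SEARCH_SPACE = {
    "n_compartments": [1, 2, 3],      # dimension 1: 1, 2 or 3 compartments
    "weight_on_vc":   [False, True],  # covariate present or absent
    "bsv_on_cl":      [False, True],
    "zero_order_abs": [False, True],
}

def decode(int_array):
    """Map one integer per dimension to the feature it selects."""
    return {dim: opts[i]
            for (dim, opts), i in zip(SEARCH_SPACE.items(), int_array)}

def bits_to_ints(bits):
    """Convert a GA bit array to an integer array, reading
    ceil(log2(n_options)) bits per dimension."""
    ints, pos = [], 0
    for opts in SEARCH_SPACE.values():
        width = max(1, math.ceil(math.log2(len(opts))))
        idx = int("".join(map(str, bits[pos:pos + width])), 2)
        ints.append(min(idx, len(opts) - 1))  # clamp out-of-range codes
        pos += width
    return ints
```

With this sketch, `decode([1, 0, 1, 0])` selects a 2-compartment model with BSV on CL and no other features.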

All 12,960 candidate models were run to identify the “true best”. The true best model was:

  • 2 compartments
  • No covariates
  • Sequential zero order and first order absorption model

Each algorithm was run with a final “local search”: starting from the final model selected by the ML algorithm, the integer array was converted to a bit array, each bit (0|1) was flipped in turn, and the resulting model run (a 1-bit search). For the 2-bit search, after each single-bit change, each of the remaining bits was also changed in turn and the resulting model run.
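The 1- and 2-bit local search can be sketched as below. The greedy repeat-until-no-improvement loop is an assumption (the abstract does not specify a termination rule), and `fitness` stands in for running the candidate model and scoring it:

```python
from itertools import combinations

def flip(bits, positions):
    """Return a copy of the bit array with the given bits flipped."""
    out = list(bits)
    for p in positions:
        out[p] = 1 - out[p]
    return out

def neighbors(bits, radius):
    """All candidate models exactly 'radius' bit flips away."""
    for combo in combinations(range(len(bits)), radius):
        yield flip(bits, combo)

def local_search(bits, fitness, radius=2):
    """Greedy local search: examine all models within a 1-bit, then
    2-bit, flip radius; move to any improvement and repeat until no
    neighbor has a lower fitness."""
    best, best_fit = list(bits), fitness(bits)
    improved = True
    while improved:
        improved = False
        for r in range(1, radius + 1):
            for cand in neighbors(best, r):
                f = fitness(cand)
                if f < best_fit:
                    best, best_fit, improved = cand, f, True
    return best, best_fit
```

In the actual workflow each `fitness` call would correspond to a NONMEM run, so the 2-bit radius roughly squares the number of models evaluated per iteration.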

Models were run in parallel, 40 at a time on a Windows server located at SUNY Buffalo.

The basis for the “goodness” of each model (termed “fitness” in GA and “cost” or “reward” in the other algorithms) was the objective function value (OFV). User-defined penalties were added to the OFV for other non-optimal outcomes, including:

  • Failure to converge – 100 points
  • Failed covariance step – 100 points
  • Absolute value of any off-diagonal element of the correlation matrix > 0.95 – 100 points
  • Parsimony penalty – 10 points for each estimated THETA, OMEGA or SIGMA element
  • Condition number > 1000 – 100 points

A posterior predictive check (PPC) was also run for each model, comparing the observed mean Cmax with the mean of the simulated Cmax. The penalty was 5 points for each 1% absolute difference between the observed mean Cmax and the mean simulated Cmax.
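Putting the penalties together, the model “fitness” can be sketched as below; the penalty values are taken from the list above, but the function signature and inputs are illustrative assumptions:

```python
def fitness(ofv, converged, cov_step_ok, max_abs_corr, n_est_params,
            condition_number, obs_cmax_mean, sim_cmax_mean):
    """Penalized objective function value; lower is better."""
    f = ofv
    if not converged:
        f += 100                      # failure to converge
    if not cov_step_ok:
        f += 100                      # failed covariance step
    if max_abs_corr > 0.95:
        f += 100                      # near-collinear parameter estimates
    f += 10 * n_est_params            # parsimony: per THETA/OMEGA/SIGMA
    if condition_number > 1000:
        f += 100                      # ill-conditioned model
    pct_diff = abs(obs_cmax_mean - sim_cmax_mean) / obs_cmax_mean * 100
    f += 5 * pct_diff                 # PPC: 5 points per 1% Cmax difference
    return f
```

For example, a cleanly converged 3-parameter model with OFV 100 and a perfect PPC would score 130, while each diagnostic failure adds 100 points on top of the OFV.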

Results: None of the ML algorithms found the true best model in the search space. A 1-bit local search at the end of the ML search also did not find the true best solution. A local search with a 2-bit radius was needed to find the true best model in all cases.

The simulation model was not the true best model, as it failed the correlation test. The true best model included a zero-order infusion but did not include the effect of weight on either CL or Vc.

Conclusions: Deceptive model search spaces can occur, as has previously been described (2). Each ML algorithm makes assumptions, namely that some pattern(s) exist in the search space: for GP, that the space defines a multivariate normal distribution; for RF and GBRT, that the surface is convex. The assumptions for GA are more complex. Wade et al. (2) demonstrated that these assumptions are typically not true and that complex interactions between features of the search space should be anticipated. Consistent with this finding, we have demonstrated that the ML algorithms alone could not, at least in this case, find the true best solution, and must be supplemented with a 1- or 2-bit radius local search.



References:
[1] Sherer E, et al. Application of a single-objective, hybrid genetic algorithm approach to pharmacokinetic model building. J Pharmacokinet Pharmacodyn. 2012;39(4):393-414.
[2] Wade JR, Beal SL, Sambol NC. Interaction between structural, statistical, and covariate models in population pharmacokinetic analysis. J Pharmacokinet Biopharm. 1994.


Reference: PAGE 30 (2022) Abstr 10053 [www.page-meeting.org/?abstract=10053]
Poster: Methodology – AI/Machine Learning