Development and evaluation of nlmixr2auto: Automated open-source model building

Zhonghui Huang (1), Joseph F Standing (1), Matt Fidler (2), Frank Kloprogge (3)

(1) Infection, Immunity and Inflammation Research & Teaching Department, UCL Great Ormond Street Institute of Child Health, London, United Kingdom (2) Novartis Pharmaceuticals, USA (3) UCL Institute for Global Health, University College London, London, United Kingdom

Introduction/Objectives. Complex pharmacometric model building reduces the time available to focus on interpretation and application of results. Automated modelling leverages strategic search algorithms to enhance model selection and optimisation. Human stepwise model-building approaches may not always arrive at the best-performing model, and automated model-building algorithms proposed to date have focussed on only a very limited number of algorithmic approaches and/or been limited to applications with commercial software.

It has recently been shown that different automated PK model-building algorithms may perform better on different types of data [1]: the stepwise approach is fast but may miss the true optimal solution, and the Ant Colony Optimisation (ACO) algorithm outperformed the genetic algorithm (GA) with a 1-bit search on sparse data. Another potentially useful algorithm is tabu search [2], which uses a “tabu list” to avoid revisiting previously explored elements and thereby improve search efficiency. For PK modelling, tabu search is expected to retain both exhaustive and stepwise characteristics, with the hope of achieving local exhaustive searches across structural, statistical, and covariate models while also increasing the chances of finding the global optimal solution. However, tabu search has yet to be studied in detail in pharmacometric applications.

The open-source R [3] package nlmixr2 [4] is available for nonlinear mixed effect modelling work and increasingly used in population pharmacokinetic research. It provides an ideal platform to test different automated modelling options and to provide automated model-building facilities to researchers who cannot access commercial platforms.

The study aimed to establish a fully automated PK modelling pipeline, nlmixr2auto, to achieve an end-to-end automated process from dataset input to corresponding best model output, including the option to implement and benchmark different automated model-building algorithms.

Methods.

Data. Intravenous datasets provided by nlmixr2data [5] (Bolus_1CPT, Bolus_2CPT, Bolus_1CPTMM, Bolus_2CPTMM, Infusion_1CPT, Infusion_2CPT, Infusion_1CPTMM, Infusion_2CPTMM and pheno_sd) were run with all algorithms. In addition, three clinical trial datasets (fluorouracil, diazepam and tobramycin) were used.

nlmixr2auto contains four modules: the search space module, the core search strategy module, the automated initial estimate module, and the model result interpretation and evaluation module.

Search space. The following model elements can be specified: the number of compartments; statistical models, including interindividual variability (IIV) on parameters (IIV on clearance is included by default) and three types of residual error; Michaelis-Menten (M-M) elimination; and weight as a covariate. The search space can be narrowed if required.
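To make the size of such a search space concrete, the following is a minimal Python sketch of how the candidate models could be enumerated and narrowed. It is not the nlmixr2auto interface: the option names, the upper limit of three compartments, the single optional IIV term on volume, and the residual error labels are all illustrative assumptions.

```python
from itertools import product

# Hypothetical encoding of the search space; names and levels are
# illustrative assumptions, not nlmixr2auto's actual options.
SEARCH_SPACE = {
    "n_compartments": [1, 2, 3],
    "iiv_v": [False, True],            # optional IIV on volume
    "mm_elimination": [False, True],   # Michaelis-Menten elimination
    "wt_covariate": [False, True],     # weight as a covariate
    "residual_error": ["additive", "proportional", "combined"],
}


def enumerate_models(space, fixed=None):
    """Yield every candidate model as a dict; `fixed` pins options to narrow the search."""
    fixed = fixed or {}
    levels = {k: ([fixed[k]] if k in fixed else v) for k, v in space.items()}
    keys = list(levels)
    for combo in product(*(levels[k] for k in keys)):
        model = dict(zip(keys, combo))
        model["iiv_cl"] = True  # IIV on clearance is always included
        yield model


full = list(enumerate_models(SEARCH_SPACE))
narrowed = list(
    enumerate_models(SEARCH_SPACE, fixed={"n_compartments": 1, "wt_covariate": False})
)
print(len(full), len(narrowed))  # 72 12
```

Under these assumptions the full space holds 72 candidate models, and pinning two options reduces it to 12, which illustrates why narrowing matters when each candidate requires a full model fit.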

Search Algorithms. The core search algorithms currently available are stepwise, GA, ACO, and tabu search. Exhaustive search is also available and has been used to provide a “ground truth” of the best model for a given dataset. GA applied a hybrid method combining tournament selection with a 1-bit local search every three iterations. ACO used a simple exploration-focused mode, in which pheromones were set based on the inverse rank of model fitness. Both GA and ACO used an elitism strategy to always retain the best solution found so far. Tabu search started from one pre-defined simple model, and tabu targets were set as the changes of model elements found in the best model, with a tabu tenure of 3 iterations. All algorithms optimised a fitness function with penalty terms based on relative standard errors (RSE), shrinkage and the values of residual error variance.
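As an illustration of how a penalised fitness function and a tabu tenure of 3 iterations can work together, here is a simplified Python sketch. It is a generic single-bit-flip tabu search, not the nlmixr2auto implementation: the thresholds, the penalty size, the aspiration rule, the binary model encoding and the toy fitness that stands in for an actual model fit are all assumptions made for the example.

```python
def penalised_fitness(ofv, rse, shrinkage, sigma,
                      rse_limit=50.0, shrink_limit=30.0,
                      sigma_limit=0.5, penalty=10.0):
    """Objective function value plus a fixed penalty per violated criterion.

    All limits and the penalty size are illustrative assumptions.
    """
    violations = sum(r > rse_limit for r in rse)
    violations += sum(s > shrink_limit for s in shrinkage)
    violations += sigma > sigma_limit
    return ofv + penalty * violations


def tabu_search(start, fitness, n_iter=20, tenure=3):
    """Minimise `fitness` over bit tuples using single-bit moves and a tabu list."""
    current = best = start
    best_fit = fitness(start)
    tabu = {}  # bit index -> last iteration at which flipping it is forbidden
    for it in range(n_iter):
        candidates = []
        for i in range(len(current)):
            neighbour = current[:i] + (1 - current[i],) + current[i + 1:]
            f = fitness(neighbour)
            # Skip tabu moves unless they beat the best model so far (aspiration).
            if tabu.get(i, -1) >= it and f >= best_fit:
                continue
            candidates.append((f, i, neighbour))
        if not candidates:
            break
        f, i, current = min(candidates)
        tabu[i] = it + tenure  # the flipped element is tabu for `tenure` iterations
        if f < best_fit:
            best, best_fit = current, f
    return best, best_fit


# Toy stand-in for a model fit: each bit switches one model element on or off.
TRUE_MODEL = (1, 0, 1, 1)


def toy_fitness(model):
    mismatches = sum(a != b for a, b in zip(model, TRUE_MODEL))
    extra = sum(a > b for a, b in zip(model, TRUE_MODEL))
    ofv = 100.0 + 25.0 * mismatches
    # Pretend that unnecessary elements are poorly estimated (high RSE).
    return penalised_fitness(ofv, rse=[20.0 + 40.0 * extra],
                             shrinkage=[10.0], sigma=0.2)


best_model, best_fit = tabu_search((0, 0, 0, 0), toy_fitness)
print(best_model, best_fit)  # (1, 0, 1, 1) 100.0
```

In this toy run the search reaches the generating model after three moves and then keeps exploring worse neighbours without losing the best model found, which is the behaviour the tabu list is meant to provide. In a real application each call to the fitness function would involve fitting a candidate model, so the number of evaluations dominates the run time.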

Initial estimates. Initial estimates can be either user-defined or generated through internal calculation; the latter applied the trapezoidal rule to normalised concentrations. Additionally, a naive pooled nonlinear least squares (NLS) regression in nlmixr2 was used for absorption, clearance and volume of distribution.
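A trapezoidal-rule calculation of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the package's algorithm: it takes "normalised" to mean dose-normalised, uses the standard non-compartmental relation CL = dose / AUC for an intravenous dose, and ignores extrapolation beyond the last observation. The function names and the example data are hypothetical.

```python
def trapezoid_auc(times, conc):
    """Linear trapezoidal area under a concentration-time curve."""
    return sum(
        (t1 - t0) * (c0 + c1) / 2.0
        for t0, t1, c0, c1 in zip(times, times[1:], conc, conc[1:])
    )


def initial_clearance(times, conc, dose):
    """Rough clearance guess for an IV dose from dose-normalised concentrations.

    CL = dose / AUC, which equals 1 / AUC of the dose-normalised profile.
    The AUC is truncated at the last sample, so CL is overestimated.
    """
    normalised = [c / dose for c in conc]
    return 1.0 / trapezoid_auc(times, normalised)


# Hypothetical single-subject profile after a 100-unit IV bolus.
times = [0.0, 1.0, 2.0, 4.0]
conc = [10.0, 8.0, 6.0, 2.0]

auc = trapezoid_auc(times, conc)
cl0 = initial_clearance(times, conc, dose=100.0)
print(auc, round(cl0, 2))
```

For this profile the truncated AUC is 24 concentration-time units, giving a starting clearance of roughly 4.17 volume units per time unit. A value of this kind only needs to be close enough for the estimation algorithm to converge.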

The nlmixr2 package in R was used for nonlinear mixed-effects modelling.

Results. The initial estimate algorithm was used to provide the starting values for modelling and parameter estimation. Initial estimates generated by this algorithm achieved good convergence of parameter estimates across all test datasets under the pre-defined models.

Four search algorithms (stepwise, tabu search, GA and ACO) were tested on simulated datasets and on clinical trial datasets from publications. Overall, all four algorithms performed well on the datasets from nlmixr2data. GA, ACO and tabu search successfully recovered the key model characteristics for all nlmixr2data cases. However, the stepwise algorithm showed unstable accuracy in model selection, especially for complex models. For example, it failed to identify nonlinear elimination in the Bolus_1CPTMM dataset, yet selected it for Bolus_2CPTMM (a dataset from a two-compartment model with Michaelis-Menten (MM) elimination). The stepwise algorithm ran considerably faster than the other search algorithms because it evaluates a limited number of models (currently no more than 20), which is its main advantage.

For the three clinical trial datasets, GA performed well in two of three cases. However, for tobramycin, GA with a 1-bit search incorrectly identified a one-compartment model, which in turn led to inaccurate selection of other model elements. This may be related to the initial population generated by GA, or may indicate a need for a design that maintains population diversity. ACO and tabu search performed better than GA across the three cases and selected models consistent with the exhaustive search. For tabu search, accuracy depended strongly on the starting point, and run time varied across cases. Convergence could take longer when the local exhaustive search space was far from the best solution.

Conclusion. nlmixr2auto performs well in population pharmacokinetic modelling tasks and is expected to become a new tool to guide modellers.

References:
[1] PAGE 31 (2023) Abstr 10704 [www.page-meeting.org/?abstract=10704]
[2] Glover, Fred. “Tabu search: A tutorial.” Interfaces 20.4 (1990): 74-94.
[3] R Core Team (2023). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
[4] Fidler, Matthew, et al. “Nonlinear mixed-effects model development and simulation using nlmixr and related R open-source packages.” CPT: Pharmacometrics & Systems Pharmacology 8.9 (2019): 621-633.
[5] https://cran.r-project.org/web/packages/nlmixr2data/index.html

Reference: PAGE 32 (2024) Abstr 11272 [www.page-meeting.org/?abstract=11272]

Poster: Methodology - New Modelling Approaches