IV-087

mlxModelFinder: Machine Learning–Enhanced Search for Efficient Structural Model Discovery in Monolix

Frano Mihaljevic 1, Matthias Pierre 1,2, Julie Bertrand 2, Géraldine Cellière 1

1 Simulations Plus France (Antony, France), 2 Université Paris Cité et Université Sorbonne Paris Nord, IAME, INSERM, F-75018 (Paris, France)

Introduction:
Automated structural model building has previously been explored, including a systematic comparison of six algorithms across simulation studies and real-world applications [1]. These studies demonstrated that algorithm performance depends strongly on the size of the structural search space: the Decision Tree (DT) approach performs best in relatively small search spaces, whereas Ant Colony Optimization (ACO) and Tournament-based strategies are more efficient and robust in larger, more complex spaces.
In this work, we further fine-tuned these three top-performing algorithms to enhance their efficiency and robustness. We implemented the optimized methods in an easy-to-use R package, mlxModelFinder, which interfaces directly with MonolixSuite for parameter estimation and model diagnostics. The package is available for download from the MonolixSuite documentation website [2].

Objectives:
The objectives of this work were to:
1. Compare different structural search algorithms in terms of accuracy and computational efficiency using real datasets, and benchmark their performance against exhaustive search.
2. Improve the decision tree algorithm by identifying an optimal sequence of structural features.
3. Enhance the Ant Colony Optimization (ACO) algorithm using alternative optimization procedures, including machine learning–based ranking of candidate models.
4. Evaluate the generalizability of the proposed algorithms beyond classical PK search spaces by applying them to more complex model types (e.g., TGI, more complex PK models, parent–metabolite models).

Methods:
The searchable space included structural PK components such as absorption (first-order, zero-order, sigmoid), distribution (one to three compartments), absorption delay, IIV structures, and residual error models. The small model library contained 54 candidate models, whereas the expanded search space included 8,064 models.
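To illustrate why the search space grows so quickly, the sketch below enumerates a model library as the Cartesian product of its components. The component names and levels are hypothetical, chosen for illustration only; they are not the package's actual library and are not meant to reproduce the 54- or 8,064-model counts reported here.

```python
from itertools import product
from math import prod

# Hypothetical component levels (illustrative; not the package's actual library).
components = {
    "absorption": ["first_order", "zero_order", "sigmoid"],
    "delay": ["none", "lag_time"],
    "distribution": [1, 2, 3],  # number of compartments
    "elimination": ["linear", "michaelis_menten"],
}


def space_size(components):
    """Number of candidate models = product of the level counts."""
    return prod(len(levels) for levels in components.values())


def enumerate_models(components):
    """List every combination as a dict {component: level}."""
    names = list(components)
    return [dict(zip(names, combo)) for combo in product(*components.values())]


models = enumerate_models(components)
print(space_size(components), len(models))  # 36 36

# Adding one more component with three levels triples the space.
extended = {**components, "residual_error": ["constant", "proportional", "combined"]}
print(space_size(extended))  # 108
```

Each additional component multiplies the number of candidates, which is why exhaustive search becomes impractical once IIV structures and residual error models are included.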
The following search strategies were implemented and compared:
● Exhaustive search (reference approach)
● Decision tree–based hierarchical search
● Ant Colony Optimization (ACO) with multiple optimization variants
The decision tree approach decomposes the search into sequential components (e.g., absorption/delay → distribution → elimination), thereby reducing combinatorial explosion. ACO explores the structural space probabilistically, updating component selection based on model performance. The machine learning–enhanced variant trained an XGBoost model on previously evaluated candidates to predict and prioritize promising untested models, improving exploration efficiency.
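The two heuristics described above can be sketched side by side on a toy problem. This is a minimal illustration under stated assumptions, not the mlxModelFinder implementation: the component levels, the additive `toy_cost` function (standing in for fitting a model in Monolix and reading its BICc), the `Evaluator` helper, the component order, and the ACO settings are all hypothetical. The ACO shown is a simplified variant that reinforces the best model found so far, and the machine learning ranking step is omitted.

```python
import random

# Hypothetical component levels and per-level penalties (illustration only).
SPACE = {
    "absorption": ["first_order", "zero_order", "sigmoid"],
    "delay": ["none", "lag_time"],
    "distribution": [1, 2, 3],
    "elimination": ["linear", "michaelis_menten"],
}
PENALTY = {
    "absorption": {"first_order": 0.0, "zero_order": 12.0, "sigmoid": 6.0},
    "delay": {"none": 8.0, "lag_time": 0.0},
    "distribution": {1: 15.0, 2: 0.0, 3: 4.0},
    "elimination": {"linear": 0.0, "michaelis_menten": 5.0},
}


def toy_cost(model):
    """Toy stand-in for a fitted model's BICc (lower is better)."""
    return 1000.0 + sum(PENALTY[name][level] for name, level in model.items())


class Evaluator:
    """Caches costs so that each distinct model is counted once."""

    def __init__(self, cost_fn):
        self.cost_fn = cost_fn
        self.cache = {}

    def __call__(self, model):
        key = tuple(sorted(model.items()))
        if key not in self.cache:
            self.cache[key] = self.cost_fn(model)
        return self.cache[key]

    @property
    def n_evaluated(self):
        return len(self.cache)


def stepwise_search(space, evaluate, order):
    """Decide one component at a time, keeping earlier decisions fixed."""
    current = {name: levels[0] for name, levels in space.items()}
    for name in order:
        current[name] = min(space[name], key=lambda lv: evaluate({**current, name: lv}))
    return current


def aco_search(space, evaluate, n_iter=10, n_ants=5, evaporation=0.3, seed=0):
    """Simplified ACO: sample levels by pheromone, reinforce the best so far."""
    rng = random.Random(seed)
    pheromone = {n: {lv: 1.0 for lv in lvls} for n, lvls in space.items()}
    best, best_cost = None, float("inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            model = {
                n: rng.choices(lvls, weights=[pheromone[n][lv] for lv in lvls])[0]
                for n, lvls in space.items()
            }
            cost = evaluate(model)
            if cost < best_cost:
                best, best_cost = model, cost
        for n in pheromone:  # evaporation on every level
            for lv in pheromone[n]:
                pheromone[n][lv] *= 1.0 - evaporation
        for n, lv in best.items():  # deposit on the best model's levels
            pheromone[n][lv] += 1.0
    return best, best_cost


order = ["distribution", "absorption", "delay", "elimination"]  # hypothetical order
ev_dt = Evaluator(toy_cost)
dt_best = stepwise_search(SPACE, ev_dt, order)
ev_aco = Evaluator(toy_cost)
aco_best, aco_cost = aco_search(SPACE, ev_aco)
print(dt_best, ev_dt.n_evaluated)
print(aco_best, aco_cost, ev_aco.n_evaluated)
```

Because the toy cost is additive, the stepwise search reaches the global optimum here after only a handful of evaluations; with real data, components interact, which is where the order of decisions and the probabilistic exploration of ACO start to matter.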
Model comparison was based on BICc in small search spaces and augmented with parameter precision penalties in larger search spaces. Performance was evaluated using 16 PK datasets. Algorithms were compared according to:
(i) distance from the best model identified by exhaustive search (ΔBICc), and
(ii) number of models evaluated.
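The two comparison criteria above amount to simple arithmetic, sketched below. The function names and the BICc values are hypothetical; only the counts 120 and 8,064 are taken from this abstract.

```python
def delta_bicc(selected_bicc, reference_bicc_values):
    """Distance of the selected model from the best model in the reference set."""
    return selected_bicc - min(reference_bicc_values)


def fraction_evaluated(n_evaluated, space_size):
    """Share of the search space that was actually fitted."""
    return n_evaluated / space_size


# Hypothetical BICc values from an exhaustive search over four models.
exhaustive = [1012.4, 1003.1, 1007.9, 1020.6]
print(round(delta_bicc(1005.0, exhaustive), 1))  # 1.9

# 120 evaluated models out of a space of 8,064.
print(f"{fraction_evaluated(120, 8064):.1%}")  # 1.5%
```

A ΔBICc of zero means the algorithm recovered the exhaustive-search optimum; small positive values indicate a near-optimal model.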
Support for custom model libraries was evaluated using TGI, TMDD, and parent–metabolite models, as well as PK datasets with expanded search spaces (e.g., dose-dependent bioavailability, saturable absorption).

Results:
In a small search space (54 models), near-optimal models were identified after evaluating approximately 8–12 models (≈15–25% of the space), with differences from the exhaustive-search optimum generally below 3 BICc units. The best-performing sequence of components was distribution → absorption/delay → distribution → elimination.
In a larger search space (8,064 models), the machine learning–enhanced ACO identified competitive models after evaluating approximately 120 models (<2% of the space). The distance from the exhaustive-search best model remained limited (on average below 5 units in the selected cost function), demonstrating a favorable balance between accuracy and computational burden. The machine learning–enhanced ACO provided the best overall trade-off between accuracy and speed. Compared with standard ACO, the ML-guided approach required fewer model evaluations while maintaining small ΔBICc values relative to the best identified models. Across datasets, this approach demonstrated improved stability and convergence toward competitive structural models, effectively narrowing the gap to exhaustive search without full enumeration.
Applications using custom model libraries on real datasets confirmed that extending the search space with user-defined structures preserved automation and computational efficiency.

Conclusions:
mlxModelFinder translates previously benchmarked automated model-building algorithms into a robust R package integrated with MonolixSuite. By combining heuristic optimization, machine learning guidance, customizable libraries, and scalable cluster execution, it enables efficient, reproducible, and flexible structural model selection, even within very large model spaces.

References:
[1] Implementation and comparison of six algorithms for automated model building in Monolix: two simulation studies and ten applications. PAGE Meeting Abstracts. https://www.page-meeting.org/Abstracts/implementation-and-comparison-of-six-algorithms-for-automated-model-building-in-monolix-two-simulation-studies-and-ten-applications/
[2] https://monolixsuite.slp-software.com/r-functions/2024R1/package-mlxmodelfinder
[3] Dorigo M, Stützle T. Ant Colony Optimization. MIT Press; 2004.
[4] Chen T, Guestrin C. XGBoost: A scalable tree boosting system. KDD; 2016.

Reference: PAGE 34 (2026) Abstr 12106 [www.page-meeting.org/?abstract=12106]

Poster: Methodology – AI/Machine Learning