**Automated proper lumping for simplification of systems models**

Shan Pan, Stephen Duffull

School of Pharmacy, University of Otago, Dunedin, New Zealand

**Background: **Systems pharmacology models are created to explore detailed mechanisms of drug behaviour. However due to large dimensionality and complexity they cannot be easily utilised as the basis for modelling of data-driven pharmacokinetic-pharmacodynamic (PKPD) studies. With model order reduction methods systems models can be simplified into simpler structures while retaining similar input-output relationships. Lumping is one model order reduction technique that merges original states into fewer pseudo-states in a reduced system. Proper lumping as a special case of lumping that merges one or more original states into one pseudo-state in the reduced system. The reduced states after proper lumping retain their physical meaning as in the original system. Recently a proper lumping technique has been used for the simplification of a large-scale systems pharmacological model [1]. Other complex systems such as physiologically based pharmacokinetic (PBPK) models have been lumped by merging the tissues with similar specification (i.e. in serial or parallel connection) [2]. Automating the process of system simplification represents a large combinatorial search problem.

**Aims:** The overall aim of this work was to develop an automatic process for the simplification of a complex model. Specific objectives were to: (1) develop an automatic proper lumping technique, (2) apply the automated process for the simplification of a PBPK model.

**Materials & Methods: **A PBPK model for fentanyl was identified from the literature [3] and used as the application example. This model predicted the arterial fentanyl concentrations over time in humans after an intravenous infusion of fentanyl. In total there were 17 states with liver as the site of metabolism. The PBPK system was written as ordinary differential equations (ODEs). For this example application of model simplification, it was considered desirable that the arterial concentration-time curve from the lumped model shows good agreement with the original profile. The criterion for an acceptable lumped model was defined as the total area under the arterial concentration-time curve (AUC) between lumped and original models was set to differ by 0.002% at maximum (termed ARD% for absolute value of the relative difference expressed as a percent). Note this criterion is arbitrary.

Proper lumping has previously been described by Dokoumetzidis and Aarons [4]. A general form of the model for a vector of model predictions (y) is given by

dy/dt = K × y.

Here K is the micro-rate constant matrix. In this technique the lumped micro-rate constant matrix (K_{L}) can be obtained by the relationship of K, the lumping matrix (M) and the pseudo-inverse of lumping matrix (M^{+}) by

K_{L} = M × K × M^{+}.

The resulting lumped micro-rate constant matrix is then used to simulate the concentration-time profile of the lumped system. The lumping matrix (M) is a user defined m × n matrix composed of 0s and 1s, where m is the number of lumped states and n is the number of original states. The M matrix transforms the states between the original and lumped systems. Note the M matrix for the setting where the lumped model equations are the same as the original model is the identity matrix (I_{n}) of dimension n (i.e. m = n).

The M matrix is specified where the sum of each column is 1 and the sum of each row ≤ n. The optimal M matrix is defined as having the minimum number of states (min(m); m ∈ {1,2,…,n}) that satisfies the criterion.

Methods of searching the matrix developed in this work included: (1) full enumeration, (2) non-adaptive random search (NARS), (3) scree plot plus NARS.

__Full enumeration__: all legal lumping matrices were exhaustively searched. The search started from the fully lumped matrix (m = 2) and then incremented one row each time until the criterion was accepted.

__NARS__: legal lumping matrices were constructed randomly. The search started from the fully lumped matrix (m = 2) and m was incremented by one after each NARS was completed or until the criterion was accepted. Number of random samples per increment in m was tested with 10, 100, 1,000 or 10,000 or 100,000 or 1,000,000.

__Scree plot plus NARS__: A scree plot was used to visualise the influence of the compartmental structures. In the scree plot, the eigenvalues of the K matrix were plotted in rank order against the ranked state number. Either a cut-off point of an eigenvalue of 1 or a change in the slope of the scree plot indicated an initial estimate of the number of states in the reduced model. This was used as the starting point for NARS.

The automatic methods described above were tested individually for simplification of original fentanyl PBPK model. In all methods it was constrained that the artery as the output state was unlumped during the search process.

**Results: **

__Full enumeration__: a 4-state lumped model was found after 40 minutes where ARD% was 0.0001% satisfying the exploratory criterion (i.e. ARD% < 0.002%). The minimum ARD% for all lumped models with m = 4 took more than two days.

__NARS__: Stationary ARD distribution was formed until 10,000 random samples and lower ARD was shown with the increment in m.

With 10 and 100 samples the random searched did not find a lumped model that satisfied the criterion (i.e. the full model was the only acceptable model). With 1,000 samples, a 14-state lumped model was found after 15 seconds. With 10,000 samples, a 6-state lumped model was found after one minute and after 100,000 samples per iteration a 5-state lumped model was found after five minutes. With 1,000,000 samples, a 4-state model was found after 30 minutes.

__Scree plot plus NARS__: The eigenvalues of the first four states were above 1 and therefore may provide the basis of those that are informative and could be left unlumped as the first iteration. The slope in the scree plot clearly levelled off up to the fourth state and also indicated four or five states in the lumped model. ARD% for both lumped models were above 38%. The lumped states were gradually unlumped and a lumped model within the ARD% criterion was not found.

In NARS the search then started from four states with 10,000 samples, and a 7-state lumped model was found after 40 seconds.

**Conclusions: **We have demonstrated that different automatic processes can be applied to simplify an existing PBPK model. It is evident that full enumeration for anything other than a simple model is not practical. The scree plot approach although informative was not optimum. In this (relatively) straightforward example, NARS with 1,000,000 samples per iteration was both relatively quick and found the optimum. The methods described here are general and more specific and efficient methods may be required for large scale problems (n > 50).

**References: ** [1] Gulati A et al. CPT Pharmacometrics Syst Pharmacol. 2014;3: e90.

[2] Nestorov I et al. J Pharmacokinet Biopharm 1998;26(1): 21-46.

[3] Björkman S et al. J Pharmacokinet Biopharm 1994;22(5): 381-410.

[4] Dokoumetzidis A and Aarons L. IET Syst Biol. 2009;3(1): 40-51.