**Parallelization in optimal experimental design using PopED**

Joakim Nyberg, Eric Strömberg, Sebastian Ueckert and Andrew C. Hooker

Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden

**Objectives:** Optimal design for population models can be time consuming. A natural way to reduce computation time is to execute the optimization and the Fisher Information Matrix (FIM) calculation in parallel, taking advantage of multi-core and cluster systems. Parallelization is especially well suited to the population FIM, since it is the sum of individual FIMs.
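The additive structure mentioned above can be illustrated with a minimal Python sketch. It is not PopED code: the model is a toy linear regression y = a + b*t with unit-variance noise (an assumption chosen because its FIM has a closed form), the function names are hypothetical, and a thread pool merely stands in for the parallel workers.

```python
from concurrent.futures import ThreadPoolExecutor


def individual_fim(times):
    """FIM for one subject under the toy model y = a + b*t + eps, eps ~ N(0, 1).

    For parameters (a, b) the information from one sample at time t is
    [[1, t], [t, t^2]]; the subject's FIM is the sum over its sampling times.
    """
    n = len(times)
    s1 = sum(times)
    s2 = sum(t * t for t in times)
    return [[n, s1], [s1, s2]]


def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices stored as nested lists."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def population_fim(design, max_workers=4):
    """Sum the individual FIMs; each individual FIM is computed independently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        fims = list(pool.map(individual_fim, design))
    total = [[0, 0], [0, 0]]
    for fim in fims:
        total = add_matrices(total, fim)
    return total


# Hypothetical design: one list of sampling times per subject.
design = [[0.5, 1.0, 2.0], [1.0, 4.0], [0.5, 8.0]]
print(population_fim(design))
```

Because each individual FIM depends only on that subject's design, the individual terms can be computed in any order and on any worker before the final sum.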

The objective of this work is to parallelize the open-source optimal design tool PopED [1], preferably without requiring extra Matlab licences or additional software costs.

**Methods:** Open MPI [2] was chosen as the message passing interface because 1) it is open source, 2) it is available for most operating systems and 3) it uses a distributed memory architecture, which is the most common cluster architecture and also works well on multi-core computers. A Matlab shared library was built to execute the FIM calculations using the free Matlab Compiler Runtime (MCR). PopED does this automatically by 1) compiling the defined model into a shared library and 2) compiling the Open MPI interface into an executable that calls the Matlab shared library. The number of cores/processors to use (n_{units}) is set in PopED before each run; one unit is dedicated to acting as job manager and the remaining n_{units}-1 units are workers. For additional flexibility, the number of designs that a worker node executes before communicating with the job manager (n_{chunk}) can also be defined. For users with the Matlab Parallel Computing Toolbox (PCT), an option is available to use PCT instead of Open MPI.
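The manager/worker scheme with chunking can be sketched in Python as follows. This is only an illustration under stated assumptions: threads and a queue stand in for MPI processes and messages, `evaluate_design` is a hypothetical placeholder for an FIM evaluation, and the dynamic hand-out of chunks is an assumed scheduling policy that the abstract does not specify.

```python
import queue
import threading


def evaluate_design(design):
    """Hypothetical stand-in for evaluating the FIM criterion of one design."""
    return sum(x * x for x in design)


def run_manager_worker(designs, n_units, n_chunk):
    """Evaluate all designs with n_units - 1 workers, n_chunk designs per message."""
    n_workers = n_units - 1  # one unit is reserved for the job manager
    if n_workers < 1:
        raise ValueError("n_units must be at least 2")

    # The manager splits the designs into chunks of at most n_chunk designs.
    tasks = queue.Queue()
    for start in range(0, len(designs), n_chunk):
        tasks.put((start, designs[start:start + n_chunk]))

    results = [None] * len(designs)
    lock = threading.Lock()

    def worker():
        while True:
            try:
                start, chunk = tasks.get_nowait()
            except queue.Empty:
                return  # no chunks left
            values = [evaluate_design(d) for d in chunk]
            # One report back to the manager per chunk, not per design.
            with lock:
                results[start:start + len(values)] = values

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


designs = [[float(i), float(i + 1)] for i in range(20)]
parallel = run_manager_worker(designs, n_units=4, n_chunk=7)
print(parallel == [evaluate_design(d) for d in designs])
```

A larger n_{chunk} means fewer messages between workers and the manager, at the cost of coarser load balancing.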

Parallelization performance δ_{exp} was defined as the execution time ratio t_{serial}/t_{parallel}. The theoretical best performance is δ_{exp}=n_{units}-1 for MPI and δ_{exp}=n_{units} for PCT, which can only be achieved by embarrassingly parallel methods with time_{MPI}=0. The FIM for two models (M_{fast}, M_{slow}) with different execution times (~1 sec and ~60 sec) was evaluated for 231 random designs with n_{chunk}=7.
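As a rough worked example of how chunking interacts with this bound, the sketch below computes the best speedup attainable when work is handed out in whole chunks. It assumes that every chunk takes the same time and that communication is free; both are simplifications, and the function name is hypothetical.

```python
import math


def chunk_limited_speedup(n_designs, n_chunk, n_units):
    """Best-case MPI speedup when work is dispatched in whole chunks.

    Assumes equal-cost chunks and zero communication time.
    """
    n_workers = n_units - 1                    # one unit acts as job manager
    n_chunks = math.ceil(n_designs / n_chunk)  # 231 designs / 7 -> 33 chunks
    rounds = math.ceil(n_chunks / n_workers)   # chunks done one after another per worker
    return n_chunks / rounds


for n_units in (4, 8, 34):
    print(n_units, chunk_limited_speedup(231, 7, n_units))
```

With 231 designs and n_{chunk}=7 there are 33 chunks, so under these assumptions the attainable speedup is 3 for n_{units}=4, 6.6 for n_{units}=8 and 33 for n_{units}=34. For n_{units}=8 the chunk granularity alone keeps the result slightly below n_{units}-1=7.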

**Results:** All of the search methods available in PopED (Random Search, Stochastic Gradient, Line Search and Modified Fedorov Exchange Algorithm) were successfully parallelized. With MPI and n_{units}=[4,8,34], δ_{exp}=[2.1,2.4,2.3] for M_{fast} and δ_{exp}=[2.7,6.3,26.7] for M_{slow}. With PCT and n_{units}=4, δ_{exp}=2.7 for M_{fast} and δ_{exp}=4 for M_{slow}.
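To put the MPI figures in perspective, the reported speedups can be divided by the theoretical best, n_{units}-1. The short Python sketch below does only this arithmetic on the numbers quoted above; the term "efficiency" and the function name are introduced here for illustration.

```python
# Reported MPI speedups (delta_exp), keyed by model and n_units.
reported = {
    "M_fast": {4: 2.1, 8: 2.4, 34: 2.3},
    "M_slow": {4: 2.7, 8: 6.3, 34: 26.7},
}


def mpi_efficiency(speedup, n_units):
    """Fraction of the theoretical best MPI speedup (n_units - 1) achieved."""
    return speedup / (n_units - 1)


for model, runs in reported.items():
    for n_units, speedup in runs.items():
        print(f"{model} n_units={n_units}: {mpi_efficiency(speedup, n_units):.2f}")
```

For M_{slow} the fraction stays at roughly 0.9, 0.9 and 0.8, whereas for M_{fast} it falls from about 0.7 to below 0.1 as n_{units} grows. This pattern would be expected if a fixed per-run overhead dominates when each FIM evaluation takes only about a second.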

**Conclusions:** The optimal design tool PopED has been parallelized, which enables time-consuming models to be executed within a more reasonable time frame without any loss of accuracy.

**References:**

[1] PopED, version 2.11 (2011). http://poped.sf.net/

[2] Open MPI, version 1.4.3 (2010). http://www.open-mpi.org