Parallelization in optimal experimental design using PopED
Joakim Nyberg, Eric Strömberg, Sebastian Ueckert and Andrew C. Hooker
Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
Objectives: Optimal design for population models can be quite time-consuming. A natural way to reduce computation time is to run the optimization and the Fisher Information Matrix (FIM) calculations in parallel, taking advantage of multi-core and cluster systems. This is especially suitable for the population FIM, since it is the sum of individual FIMs.
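Because the population FIM is a sum of individual FIMs, each term can be computed independently and the results added afterwards. The following is a minimal Python sketch of that structure, not PopED code. The function individual_fim and the toy model (a fixed-effects straight line y = a + b*t with additive noise) are hypothetical stand-ins for a real population model, and a thread pool is used only to show the map-then-sum pattern; real speedups for CPU-bound work would need separate processes or MPI.

```python
from concurrent.futures import ThreadPoolExecutor


def individual_fim(times, sigma=1.0):
    """Hypothetical individual FIM for the toy model y = a + b*t, noise sd sigma."""
    fim = [[0.0, 0.0], [0.0, 0.0]]
    for t in times:
        grad = (1.0, t)  # sensitivities dy/da and dy/db at sampling time t
        for r in range(2):
            for c in range(2):
                fim[r][c] += grad[r] * grad[c] / sigma ** 2
    return fim


def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices stored as nested lists."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def population_fim(designs, max_workers=4):
    """Evaluate each individual FIM independently, then sum the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = list(pool.map(individual_fim, designs))
    total = [[0.0, 0.0], [0.0, 0.0]]
    for part in parts:
        total = add_matrices(total, part)
    return total


# Three individuals, each with their own (made-up) sampling times.
designs = [[0.0, 1.0, 2.0], [0.5, 4.0], [1.0, 3.0, 6.0]]
fim = population_fim(designs)
```

The summation order does not affect the result beyond floating-point rounding, which is why the individual terms can be handed to any number of workers.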
The objective of this work is to parallelize the open-source optimal design tool PopED [1], preferably without requiring extra Matlab licences or additional software costs.
Methods: Open MPI [2] was chosen as the message passing interface because 1) it is open source, 2) it is available for most operating systems, and 3) it uses a distributed-memory architecture, which is the most common cluster architecture and also works well on multi-core computers. A Matlab shared library was built to execute the FIM calculations using the free Matlab Compiler Runtime (MCR). PopED does this automatically by 1) compiling the defined model into a shared library and 2) compiling the Open MPI interface into an executable that calls the Matlab shared library. The number of cores/processors to use (nunits) is set in PopED before each run; one unit is dedicated as the job manager and the remaining nunits-1 units are workers. For additional flexibility, the number of designs a worker node evaluates before communicating with the job manager (nchunk) can also be defined. For users with the Matlab Parallel Computing Toolbox (PCT), an option is available to use the PCT instead of Open MPI.
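The roles of nunits and nchunk can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the PopED implementation: the helper names make_chunks and assign_round_robin are hypothetical, and the static round-robin assignment is chosen for simplicity (the actual job manager may hand out chunks dynamically). The numbers 231 and 7 are the design count and nchunk used in the evaluation reported in this abstract.

```python
def make_chunks(designs, nchunk):
    """Split the list of designs into consecutive chunks of at most nchunk designs."""
    return [designs[i:i + nchunk] for i in range(0, len(designs), nchunk)]


def assign_round_robin(chunks, nunits):
    """Distribute chunks over the nunits-1 workers; unit 0 is the job manager.

    Static round-robin assignment is an assumption made for this sketch.
    """
    n_workers = nunits - 1
    if n_workers < 1:
        raise ValueError("need at least one worker besides the job manager")
    assignment = {worker: [] for worker in range(1, n_workers + 1)}
    for index, chunk in enumerate(chunks):
        assignment[1 + index % n_workers].append(chunk)
    return assignment


designs = list(range(231))           # placeholder design identifiers
chunks = make_chunks(designs, nchunk=7)
plan_4 = assign_round_robin(chunks, nunits=4)
plan_34 = assign_round_robin(chunks, nunits=34)
```

With 231 designs and nchunk=7 there are 33 chunks, so under this scheme nunits=34 gives each of the 33 workers exactly one chunk, while nunits=4 gives each of the 3 workers 11 chunks. A larger nchunk means fewer messages to the job manager but coarser load balancing.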
Parallelization performance, δexp, was defined as the ratio of execution times tserial/tparallel. The theoretical best performance is δexp=nunits-1 for MPI (one unit is reserved for the job manager) and δexp=nunits for PCT; this bound can only be reached by embarrassingly parallel methods with zero communication time (timeMPI=0). The FIM for two models (Mfast, Mslow) with different execution times (~1 sec and ~60 sec) was evaluated for 231 random designs with nchunk=7.
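To see why communication time matters more for the fast model than for the slow one, a simple timing model can be sketched in Python. This model is an assumption introduced here for illustration and is not taken from PopED: it supposes that every design takes the same time t_design, that workers process chunks in synchronized rounds, and that each chunk incurs a fixed hypothetical overhead t_mpi. The overhead value of 0.5 sec below is made up.

```python
import math


def predicted_speedup(t_design, n_designs, nunits, nchunk, t_mpi):
    """Speedup t_serial / t_parallel under a simple round-based timing model.

    Assumes identical design times and a fixed per-chunk overhead t_mpi.
    A shorter final chunk is charged as a full one, so t_parallel is
    slightly overestimated when n_designs is not a multiple of nchunk.
    """
    n_workers = nunits - 1                      # one unit is the job manager
    if n_workers < 1:
        raise ValueError("need at least one worker besides the job manager")
    n_chunks = math.ceil(n_designs / nchunk)
    n_rounds = math.ceil(n_chunks / n_workers)  # chunks handled per worker
    t_serial = n_designs * t_design
    t_parallel = n_rounds * (nchunk * t_design + t_mpi)
    return t_serial / t_parallel


ideal = predicted_speedup(1.0, 231, 4, 7, t_mpi=0.0)    # no communication cost
fast = predicted_speedup(1.0, 231, 4, 7, t_mpi=0.5)     # ~1 sec per design
slow = predicted_speedup(60.0, 231, 4, 7, t_mpi=0.5)    # ~60 sec per design
```

In this model the bound nunits-1 is reached only when t_mpi is zero and the chunks divide evenly among the workers; for the same overhead, the slower model stays closer to the bound because the overhead is a smaller fraction of each chunk's run time.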
Results: All of the search methods available in PopED (Random Search, Stochastic Gradient, Line Search, and Modified Fedorov Exchange Algorithm) were successfully parallelized. For the two test models with MPI: Mfast: δexp=[2.1, 2.4, 2.3] and Mslow: δexp=[2.7, 6.3, 26.7] for nunits=[4, 8, 34]. With PCT: Mfast: δexp=2.7 and Mslow: δexp=4 for nunits=4.
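One way to compare these results across different values of nunits is to divide each reported δexp by its theoretical best (nunits-1 for MPI, nunits for PCT). The Python sketch below does this arithmetic on the values reported above. The resulting "efficiency" is a derived quantity introduced here for illustration, not a measure defined in the abstract.

```python
# Reported speedups, keyed by (model, method) and then by nunits.
reported = {
    ("Mfast", "MPI"): {4: 2.1, 8: 2.4, 34: 2.3},
    ("Mslow", "MPI"): {4: 2.7, 8: 6.3, 34: 26.7},
    ("Mfast", "PCT"): {4: 2.7},
    ("Mslow", "PCT"): {4: 4.0},
}


def ideal_speedup(method, nunits):
    """Theoretical best speedup: MPI reserves one unit for the job manager."""
    return nunits - 1 if method == "MPI" else nunits


def efficiencies(results):
    """Fraction of the theoretical best speedup achieved in each run."""
    return {
        key: {n: d / ideal_speedup(key[1], n) for n, d in runs.items()}
        for key, runs in results.items()
    }


eff = efficiencies(reported)
for (model, method), runs in eff.items():
    for nunits, value in runs.items():
        print(f"{model} {method} nunits={nunits}: {value:.2f}")
```

Computed this way, Mslow with MPI retains roughly 0.8 to 0.9 of the theoretical best at every tested nunits, whereas Mfast with MPI drops from 0.7 at nunits=4 to under 0.1 at nunits=34, consistent with communication time dominating short FIM evaluations.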
Conclusions: The optimal design tool PopED has been parallelized, which enables time-consuming models to be executed within a more reasonable time frame without any loss of accuracy.
References:
[1] PopED, version 2.11 (2011). http://poped.sf.net/
[2] Open MPI, version 1.4.3 (2010). http://www.open-mpi.org
