Using simulation-based metrics to detect model misspecifications
Emmanuelle Comets, France Mentré
INSERM UMR738, University Paris Diderot, 16 rue Henri Huchard, 75018 Paris, France
Objectives: Model evaluation is an important part of model building and has been the subject of regulatory guidelines. Weighted residuals have long been used for model diagnostics, but their computation relies on a linearisation of the model and their shortcomings have been demonstrated. Prediction discrepancies (pd) and normalised prediction distribution errors (npde) have been proposed as alternatives that take into account the full predictive distribution and have better statistical properties. In the present paper we present an alternative way to compute npde which avoids an approximation during the decorrelation step. We also illustrate the use of npde, pd, and VPC on several simulated datasets.
Methods: We assume that a model MB has been built using a building dataset B. Our null hypothesis is that this model can be used to describe the data collected in a validation dataset V (which can be B itself in internal evaluation). Visual Predictive Checks (VPC), prediction discrepancies (pd) and normalised prediction distribution errors (npde) all belong to the general class of posterior predictive checks, whereby model MB is used to simulate data according to the design of V, and the metric computed on the real data in V is compared to the distribution of the same metric obtained through the simulations. VPC are obtained by plotting prediction intervals over time. Prediction discrepancies are computed as the quantile of each observation in the corresponding predictive distribution, while npde are computed similarly but after decorrelating both the simulated and observed data. Scatterplots of pd and npde versus time or predicted concentrations can be used to evaluate different aspects of model misspecification.
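To make the two metrics concrete, the following is a minimal numerical sketch in Python/NumPy (not part of the original abstract). It assumes the K simulated replicates of one subject's design are arranged as a K x n matrix `y_sim` and the subject's observations as a length-n vector `y_obs`; the decorrelation uses the empirical mean and a Cholesky factor of the empirical covariance of the simulations, one of the decorrelation options discussed in the npde literature.

```python
import numpy as np
from scipy import stats

def compute_pd(y_obs, y_sim):
    """Prediction discrepancies: quantile of each observation in its
    simulated predictive distribution.
    y_obs: (n,) observations for one subject
    y_sim: (K, n) simulated replicates of the same design
    """
    K = y_sim.shape[0]
    # Fraction of simulated values below the observation, shifted so
    # that pd never equals exactly 0 or 1.
    return (np.sum(y_sim < y_obs, axis=0) + 0.5) / (K + 1)

def compute_npde(y_obs, y_sim):
    """Normalised prediction distribution errors: decorrelate both
    observed and simulated data using the empirical mean and covariance
    of the simulations, compute pd on the decorrelated data, then map
    the result to the normal scale."""
    mu = y_sim.mean(axis=0)
    V = np.cov(y_sim, rowvar=False)       # empirical (n, n) covariance
    L = np.linalg.cholesky(V)
    # Decorrelate: solve L * x = (y - mu) for observed and simulated data.
    y_obs_dec = np.linalg.solve(L, y_obs - mu)
    y_sim_dec = np.linalg.solve(L, (y_sim - mu).T).T
    K = y_sim.shape[0]
    pde = (np.sum(y_sim_dec < y_obs_dec, axis=0) + 0.5) / (K + 1)
    return stats.norm.ppf(pde)            # inverse normal CDF
```

Under the null hypothesis, the pd are expected to be approximately uniform on (0, 1) and, after decorrelation, the npde to follow a standard normal distribution.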
Results: Prediction bands around selected percentiles can be obtained through repeated simulations under the model being tested, and adding them to VPC plots, or to plots of pd and npde versus time and predictions, is useful to highlight model deficiencies. Finally, tests can be used to assess whether the npde follow their theoretical standard normal distribution, providing a complement to the graphs. Datasets were simulated under several conditions: a first dataset was simulated with the same model used to compute the metrics; three other datasets were simulated with different model misspecifications. Applying the different tests and graphs to the metrics computed for each dataset, we show how various model misspecifications can be detected.
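The tests on the npde can be sketched as follows (again an illustrative Python snippet, not the abstract's own code): the npde literature combines a test of mean 0, a test of variance 1, and a normality test, with a Bonferroni correction for the global decision. The variance test below uses the chi-square distribution of the scaled sample variance under H0; this is a simplified stand-in for the exact combination used in the cited papers.

```python
import numpy as np
from scipy import stats

def npde_tests(npde, alpha=0.05):
    """Test whether npde are compatible with N(0, 1):
    mean = 0 (t-test), variance = 1 (two-sided chi-square test on the
    sample variance), normality (Shapiro-Wilk), combined with a
    Bonferroni correction."""
    npde = np.asarray(npde)
    n = len(npde)
    p_mean = stats.ttest_1samp(npde, 0.0).pvalue
    # Under H0 (sigma^2 = 1), (n - 1) * s^2 follows chi2(n - 1).
    chi2_stat = (n - 1) * npde.var(ddof=1)
    p_var = 2 * min(stats.chi2.cdf(chi2_stat, n - 1),
                    stats.chi2.sf(chi2_stat, n - 1))
    p_norm = stats.shapiro(npde).pvalue
    p_global = min(1.0, 3 * min(p_mean, p_var, p_norm))  # Bonferroni
    return {"mean": p_mean, "variance": p_var, "normality": p_norm,
            "global": p_global, "reject": p_global < alpha}
```

A clear departure of the npde from N(0, 1), such as a shifted mean, drives the global p-value down and flags the model as misspecified.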
Conclusions: Graphs with prediction bands were found to be especially useful and visually appealing, while tests based on the npde were able to detect model misspecification in the three datasets simulated under misspecified models.
References:
1. F Mentré, S Escolano (2006). Prediction discrepancies for the evaluation of nonlinear mixed-effects models. Journal of Pharmacokinetics and Pharmacodynamics, 33: 345-67.
2. K Brendel, E Comets, C Laffont, C Laveille, F Mentré (2006). Metrics for external model evaluation with an application to the population pharmacokinetics of gliclazide. Pharmaceutical Research, 23: 2036-49.
3. K Brendel, E Comets, C Laffont, F Mentré (2010). Evaluation of different tests based on observations for external model evaluation of population analyses. Journal of Pharmacokinetics and Pharmacodynamics, 37: 49-65.
4. E Comets, K Brendel, F Mentré (2008). Computing normalised prediction distribution errors to evaluate nonlinear mixed-effect models: the npde add-on package for R. Computer Methods and Programs in Biomedicine, 90: 154-66.