**Normalized Prediction Distribution Error for the Evaluation of Nonlinear Mixed-Models**

Brendel K1,2, Comets E1, Laffont C.M2, Mentré F1,3.

(1) INSERM U738, Paris, France; University Paris 7, Paris, France; (2) Institut de recherches internationales Servier, Courbevoie, France; (3) AP-HP, Bichat Hospital, Paris, France.

**Introduction:** Although population pharmacokinetic and/or pharmacodynamic model evaluation is recommended by regulatory authorities, there is currently no consensus on the appropriate approach to assess a population model. We have also shown in a recent literature survey [1] that model evaluation was not appropriately performed in most published population pharmacokinetic-pharmacodynamic analyses. In this context, we describe a new metric that can be used for model evaluation in population PK or PD analyses. Our objectives were, first, to illustrate this metric by proposing different tests and graphs and, second, to evaluate it by simulation for a pharmacokinetic model with or without covariates.

**Definition of the Normalized Prediction Distribution Error (NPDE):** The null hypothesis (H0) is that data in the validation dataset can be described by a given model. Let MB be a model built from a dataset B and V a validation dataset.

Among the different approaches proposed in the literature to evaluate a population model, standardised prediction errors (computed in NONMEM as WRES) are frequently used, but they are computed using a first-order approximation of the model. In this context, we developed a metric called Normalized Prediction Distribution Error (NPDE), based on the whole predictive distribution. For each observation, we define the prediction discrepancy as the percentile of this observation in the whole marginal predictive distribution under H0 [2]. The predictive distribution is obtained through Monte Carlo simulations. As prediction discrepancies are correlated within an individual, we use the mean and variance of the predicted observations, estimated empirically from the simulations, to obtain uncorrelated metrics [3]. NPDE are then obtained by applying the inverse of the standard normal cumulative distribution function. By construction, if H0 is true, NPDE follow a N(0, 1) distribution without any approximation and are uncorrelated within an individual.
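The computation described above can be sketched for a single individual as follows. This is a minimal illustration, not the authors' implementation: the function name `npde_one_subject`, the toy simulation model, the use of a Cholesky factor of the empirical covariance for decorrelation, and the clipping of percentiles away from 0 and 1 are all assumptions made for the example.

```python
import numpy as np
from scipy.linalg import solve_triangular
from scipy.stats import norm


def npde_one_subject(y_obs, y_sim):
    """NPDE for one individual (illustrative sketch).

    y_obs: observed values, shape (n,)
    y_sim: K Monte Carlo replicates of the same design under the model, shape (K, n)
    """
    y_obs = np.asarray(y_obs, dtype=float)
    y_sim = np.asarray(y_sim, dtype=float)
    k = y_sim.shape[0]

    # Empirical mean and covariance of the simulated predictions.
    mean = y_sim.mean(axis=0)
    cov = np.atleast_2d(np.cov(y_sim, rowvar=False))

    # Assumption: decorrelate with the Cholesky factor (cov = chol @ chol.T).
    chol = np.linalg.cholesky(cov)
    obs_dec = solve_triangular(chol, y_obs - mean, lower=True)
    sim_dec = solve_triangular(chol, (y_sim - mean).T, lower=True).T

    # Percentile of each decorrelated observation among the decorrelated simulations.
    pde = (sim_dec < obs_dec).mean(axis=0)
    # Assumption: keep percentiles off 0 and 1 so the normal quantile stays finite.
    pde = np.clip(pde, 0.5 / k, 1.0 - 0.5 / k)
    return norm.ppf(pde)


# Toy example (hypothetical model): log-normal random effect plus additive noise.
rng = np.random.default_rng(1)
times = np.array([0.5, 1.0, 2.0, 4.0])


def simulate(n_rows):
    eta = rng.normal(0.0, 0.3, size=(n_rows, 1))
    eps = rng.normal(0.0, 0.1, size=(n_rows, times.size))
    return np.exp(eta) * np.exp(-0.5 * times) + eps


npde_demo = npde_one_subject(simulate(1)[0], simulate(1000))
print(npde_demo)
```

Pooling the values returned for every individual in V gives the sample on which the tests are run.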

For WRES and NPDE, we use a Wilcoxon signed rank test to test whether the mean is significantly different from 0, a Fisher test to test whether the variance is significantly different from 1, a Shapiro-Wilk test to test whether the distribution is significantly different from a normal distribution, and a Kolmogorov-Smirnov test to test the departure from a N(0, 1) distribution. The four tests must be considered sequentially to decide whether to reject H0 for a validation dataset.
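As a rough sketch, the four tests can be run with SciPy on a pooled vector of NPDE (or WRES). The function name `npde_tests` and the dictionary keys are hypothetical. We also assume here that the Fisher test for the variance is the classical chi-square test of a variance against the value 1; the abstract does not specify its exact form.

```python
import numpy as np
from scipy import stats


def npde_tests(x):
    """Return the p-values of the four tests for a vector of metrics."""
    x = np.asarray(x, dtype=float)
    n = x.size

    # Wilcoxon signed rank test: location different from 0.
    p_mean = stats.wilcoxon(x).pvalue

    # Assumption: two-sided chi-square test of the variance against 1.
    stat = (n - 1) * x.var(ddof=1)
    cdf = stats.chi2.cdf(stat, df=n - 1)
    p_var = 2.0 * min(cdf, 1.0 - cdf)

    # Shapiro-Wilk test: departure from normality (any mean and variance).
    p_normal = stats.shapiro(x).pvalue

    # Kolmogorov-Smirnov test: departure from N(0, 1) specifically.
    p_n01 = stats.kstest(x, "norm").pvalue

    return {"mean": p_mean, "variance": p_var, "normality": p_normal, "n01": p_n01}


rng = np.random.default_rng(2)
print(npde_tests(rng.normal(size=300)))
```

The function only returns the p-values; how they are combined into a sequential decision is left to the analyst.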

*a) Illustrative example*

NPDE and WRES were applied to evaluate a one-compartment model with zero-order absorption and first-order elimination, built from two phase II studies. These metrics were applied to two simulated validation datasets based on the design of a real phase I study: the first (Vtrue) was simulated with the parameter values estimated in MB; the second (Vfalse) was simulated using the same model with the bioavailability multiplied by two.

Even on Vtrue, WRES differed significantly from a normal distribution, whereas NPDE followed a normal distribution. The mean was not significantly different from 0 for either WRES or NPDE. On Vfalse, neither WRES nor NPDE followed a normal distribution, and both showed a mean significantly different from 0. In conclusion, NPDE appropriately accepted Vtrue and rejected Vfalse, whereas WRES discriminated less well between the two datasets.

*b) Evaluation of the type I error by simulations*

The model used for simulations was a one-compartment model with first-order absorption, built from two phase II studies and one phase III study. With this model (without covariates), we simulated 1000 external validation datasets according to the design of another phase III study and calculated NPDE and WRES for these simulated datasets. We evaluated the type I error of the Kolmogorov-Smirnov test for these two metrics. The simulations under H0 showed a high type I error for the Kolmogorov-Smirnov test applied to WRES, whereas the same test showed a type I error close to 5% for NPDE.
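The NPDE side of this simulation study can be imitated on a small scale. The sketch below is only an illustration under stated assumptions: the model in `simulate` is a hypothetical toy model rather than the pharmacokinetic model of the study, the numbers of datasets, subjects and Monte Carlo replicates are reduced to keep the run short, and the NPDE helper uses a Cholesky decorrelation and percentile clipping that the abstract does not specify. WRES are not computed here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
times = np.array([1.0, 2.0, 4.0])


def simulate(n_rows):
    # Hypothetical toy model: log-normal random effect plus additive noise.
    eta = rng.normal(0.0, 0.3, size=(n_rows, 1))
    eps = rng.normal(0.0, 0.1, size=(n_rows, times.size))
    return np.exp(eta) * np.exp(-0.5 * times) + eps


def npde(y_obs, y_sim):
    # Decorrelate with the empirical mean and covariance, then take percentiles.
    k = y_sim.shape[0]
    mean = y_sim.mean(axis=0)
    chol = np.linalg.cholesky(np.cov(y_sim, rowvar=False))
    obs_dec = np.linalg.solve(chol, y_obs - mean)
    sim_dec = np.linalg.solve(chol, (y_sim - mean).T).T
    pde = np.clip((sim_dec < obs_dec).mean(axis=0), 0.5 / k, 1.0 - 0.5 / k)
    return stats.norm.ppf(pde)


def type_one_error(n_datasets=200, n_subjects=20, k=500, alpha=0.05):
    """Fraction of datasets simulated under H0 that the KS test rejects."""
    rejections = 0
    for _ in range(n_datasets):
        y_val = simulate(n_subjects)  # validation dataset simulated under H0
        values = np.concatenate([npde(y, simulate(k)) for y in y_val])
        rejections += int(stats.kstest(values, "norm").pvalue < alpha)
    return rejections / n_datasets


rate = type_one_error()
print(f"Empirical type I error of the KS test on NPDE: {rate:.3f}")
```

With so few replicates the estimate is noisy, so it should only be read as a rough check against the nominal 5% level.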

**Evaluation of NPDE applied to a model with covariates:** We then considered covariate models and investigated the application of NPDE to these models. We used the covariates of a real phase III study and generated several validation datasets under H0 and under alternative assumptions: without covariates, with one continuous covariate (weight), or with one categorical covariate (sex). We proposed two approaches to evaluate models with covariates using NPDE. The first approach uses the Spearman correlation test or the Wilcoxon test to test the relationship between NPDE and weight or sex, respectively. The second approach tests whether the NPDE follow a N(0, 1) distribution after splitting them into groups according to the values of the covariates. The Spearman and Wilcoxon tests were not significant when the model and the validation dataset were consistent. When the validation dataset and the model were not consistent, these tests showed a significant relationship between NPDE and the covariates. We found the same results using the Kolmogorov-Smirnov test after splitting NPDE by covariates.
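Both approaches can be sketched with SciPy. The function names, the simulated covariates and the size of the simulated misspecification are hypothetical. We assume that the Wilcoxon test for sex is the two-sample rank-sum test, and how a continuous covariate such as weight is cut into groups for the second approach is left to the caller.

```python
import numpy as np
from scipy import stats


def covariate_tests(npde, weight, sex):
    """First approach: test the relationship between NPDE and each covariate."""
    npde, weight, sex = map(np.asarray, (npde, weight, sex))
    p_weight = stats.spearmanr(npde, weight).pvalue
    levels = np.unique(sex)
    # Assumption: two-sample Wilcoxon rank-sum test between the two sex groups.
    p_sex = stats.ranksums(npde[sex == levels[0]], npde[sex == levels[1]]).pvalue
    return p_weight, p_sex


def ks_by_group(npde, labels):
    """Second approach: KS test against N(0, 1) within each covariate group."""
    npde, labels = np.asarray(npde), np.asarray(labels)
    return {
        label: stats.kstest(npde[labels == label], "norm").pvalue
        for label in np.unique(labels).tolist()
    }


# Toy data: NPDE simulated directly, with a hypothetical weight and sex effect
# left in the "misspecified" case.
rng = np.random.default_rng(5)
n = 300
weight = rng.normal(70.0, 10.0, size=n)
sex = rng.integers(0, 2, size=n)
npde_ok = rng.normal(size=n)
npde_bad = npde_ok + 0.05 * (weight - 70.0) + 1.0 * sex

print(covariate_tests(npde_ok, weight, sex))
print(covariate_tests(npde_bad, weight, sex))
print(ks_by_group(npde_bad, sex))
```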

**Conclusions:** We assess here by simulation the statistical properties of NPDE for evaluation of a population pharmacokinetic model in comparison with WRES. We also evaluate the ability of NPDE to detect misspecification in the covariate model.

We recommend using NPDE rather than WRES for model evaluation. NPDE do not depend on an approximation of the model and have good statistical properties. They can be viewed as a good alternative way of evaluating a model based on Monte Carlo simulated predictions. NPDE thus appear to be a good tool for evaluating population models, with or without covariates.

**References:**

[1] Brendel K., Dartois C., Comets E., Lemenuel-Diot A., Laveille C., Tranchand B., Girard P., Laffont C.M., Mentré F. Are population pharmacokinetic and/or pharmacodynamic models adequately evaluated? Clin Pharmacokinet. 46(3) (2007).

[2] Mentré F., Escolano S. Prediction discrepancies for the evaluation of nonlinear mixed-effects models. J Pharmacokinet Pharmacodyn. 33(3):345-367 (2006).

[3] Brendel K., Comets E., Laffont C., Laveille C., Mentré F. Metrics for external model evaluation with an application to the population pharmacokinetics of gliclazide. Pharm Res. 23(9):2036-2049 (2006).