Patterns and power for the visual predictive check
Wilkins, Justin J (1), Karlsson, Mats O (1), Jonsson, E Niclas (2)
(1) Division of Pharmacokinetics and Drug Therapy, Department of Pharmaceutical Biosciences, Uppsala University, Box 591, SE-751 24 Uppsala, Sweden; (2) F. Hoffmann-La Roche, PDMP 15/1.052, Grenzacherstrasse 124, CH-4070 Basel, Switzerland
Introduction. The visual predictive check (VPC), a technique in which model appropriateness is tested by means of prediction intervals (PIs) derived by simulation from final model parameter estimates, has been suggested as an alternative to standard diagnostic plots (1). The VPC is a means of ensuring that a given model adequately describes the data used to develop it, but is less effective at identifying specific sources of model misspecification. The objectives of this study were to use our implementation of the VPC to assess the extent and sources of variability in the number of observed data points lying outside specified prediction intervals, to assess any relationship between this variability and the numbers of observations and individuals in the data, to assess the power of the method to detect model misspecification, and to investigate whether specific classes of misspecification could be identified from trends in predictive check output.
Methods. NONMEM was used to simulate new datasets (between 300 and 2 000) from a simple one-compartment oral pharmacokinetic model, using several design permutations with varying ratios of observations to subjects and with regular and random sampling schedules. Nonparametric PIs were generated for each of the original observations, corresponding to the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, 90th, 95th and 99th percentiles in each case. Specific model misspecifications were introduced into the basic model used to simulate the ‘correct’ dataset. The resulting ‘incorrect’ model was fitted to the ‘correct’ data, producing parameter estimates that were used to simulate prediction intervals. The numbers of observations outside the upper and lower limits of these intervals were compared with the expected number in each case.
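The outlier-counting step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that each PI is a central interval whose nominal coverage p has limits at the (1 - p)/2 and (1 + p)/2 quantiles of the simulated values for that observation, and that the expected number of outliers on each side is n(1 - p)/2. The function name `vpc_outlier_counts` and the array layout (one row per simulated dataset, one column per original observation) are hypothetical choices made for the example.

```python
import numpy as np


def vpc_outlier_counts(observed, simulated, coverages):
    """Count observations below/above nonparametric prediction intervals.

    observed  : sequence of n_obs observed values
    simulated : array of shape (n_sim, n_obs); row i holds the i-th
                simulated dataset, aligned with `observed`
    coverages : nominal PI coverages as fractions, e.g. [0.5, 0.9]

    Assumption (not stated in the abstract): each PI is central, so a
    coverage p uses the (1 - p)/2 and (1 + p)/2 empirical quantiles.
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    if simulated.ndim != 2 or simulated.shape[1] != observed.size:
        raise ValueError("simulated must have shape (n_sim, n_obs)")

    n_obs = observed.size
    results = []
    for cov in coverages:
        lo_pct = 100.0 * (1.0 - cov) / 2.0
        hi_pct = 100.0 * (1.0 + cov) / 2.0
        # Per-observation PI limits taken across the simulated datasets.
        lower, upper = np.percentile(simulated, [lo_pct, hi_pct], axis=0)
        results.append({
            "coverage": cov,
            "below": int(np.sum(observed < lower)),
            "above": int(np.sum(observed > upper)),
            # Expected count on each side if the model were correct.
            "expected_per_side": n_obs * (1.0 - cov) / 2.0,
        })
    return results


if __name__ == "__main__":
    # Toy data only: observations and simulations share one distribution,
    # so the counts should scatter around the expected values.
    rng = np.random.default_rng(0)
    sims = rng.lognormal(mean=0.0, sigma=0.5, size=(1000, 200))
    obs = rng.lognormal(mean=0.0, sigma=0.5, size=200)
    for row in vpc_outlier_counts(obs, sims, [0.5, 0.8, 0.9, 0.95]):
        print(row)
```

A marked asymmetry between the `below` and `above` counts at a given coverage, or counts far from `expected_per_side`, is the kind of signal the comparison in the Methods is looking for; how large a deviation is meaningful depends on the design and is not addressed by this sketch.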
Results. The number of outliers at different percentiles was sensitive to data composition where interindividual variability (IIV) was misspecified. For the example of IIV on clearance, where subject numbers were low but observations plentiful, residual error was dominant over IIV, and systematic underprediction could be discerned at tighter prediction intervals (10%-70%). However, where IIV was the dominant source of variability (many subjects with few observations each), the simulated prediction intervals shrank and more outliers were seen in wider PIs (80%-95%). Misspecified residual error had the general effect of producing distinctive underprediction at early time points, apparent from dramatically biased distributions of outliers at the wider PIs, followed by significant overprediction at later time points.
Conclusions. For the models tested, the VPC was sensitive to the ratio of individuals to observations in the data, producing varying patterns of outliers at different prediction intervals depending on this ratio. Distinctive outlier and residual patterns were identified for specific classes of model misspecification, which may make it possible to detect such cases automatically through software in the future.
References. (1) Holford N. The Visual Predictive Check – Superiority to Standard Diagnostic (Rorschach) Plots. PAGE 14 (2005) Abstr 738 [www.page-meeting.org/?abstract=738]