C-19 Marc Cerou Performance of npde for the evaluation of joint model with time to event data

Marc Cerou (1,2,3), Marylore Chenel (3), Emmanuelle Comets (1,2)

(1) Inserm, IAME, UMR 1137, University Paris Diderot, Sorbonne Paris Cité, France, (2) Inserm, CIC 1414, University Rennes-1, France, (3) Institut de Recherches Internationales Servier, France

Objectives:

Joint models are increasingly used in clinical trials. An important part of model building is to properly assess the descriptive and predictive ability of these models. Normalised prediction discrepancies (npd) and normalised prediction distribution errors (npde) have been developed to evaluate nonlinear mixed effect models for continuous responses, both graphically and statistically [1]. In the present work, we extend npd to time-to-event (TTE) models [2].

The aims of this work were to:

develop npd for TTE data and evaluate their performance on a simulated example

evaluate the performance of the combined test for joint longitudinal and TTE models

Methods:

Let V denote a dataset. In this work we first consider a dataset with only TTE observations and then a dataset with both longitudinal and TTE observations. The null hypothesis H_{0} is that the observations in V can be described by the model under evaluation. Prediction discrepancies (pd) are defined as the quantile of each observation within its predictive distribution. In nonlinear mixed effect models (NLME), the predictive distribution is approximated by Monte-Carlo simulations (MCs). The pd for unobserved (censored) event times are imputed from a uniform distribution based on the model prediction of the probability of censoring [2], using a method similar to the one developed to handle data below the lower limit of quantification (LOQ) [3]. Under H_{0}, the pd follow a uniform distribution U(0,1) [1]. They can be transformed to a normal N(0,1) distribution using the inverse normal cumulative distribution function, giving the npd, and we test their distribution either through a Kolmogorov-Smirnov test or through a combined test of normality, mean and variance [1].
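The computation described above can be sketched as follows for a single subject. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the imputation interval for a censored time (uniform between the predicted probability of an event by the censoring time and 1) is our reading of the approach in [2], and the bounding of pd at 1/(2K) and 1 - 1/(2K), with K the number of simulations, is an assumed convention to keep the normal transform finite.

```python
import random
from statistics import NormalDist


def prediction_discrepancy(time, observed, simulated_times, rng):
    """Return the pd of one subject from K simulated event times.

    time            -- observed event time, or censoring time if not observed
    observed        -- True if the event was observed, False if censored
    simulated_times -- K event times simulated under the model for this subject
    rng             -- random.Random instance used for the imputation
    """
    k = len(simulated_times)
    # Monte-Carlo estimate of the predicted probability of an event by `time`.
    p = sum(1 for s in simulated_times if s <= time) / k
    if not observed:
        # Assumed imputation rule: the true event time lies beyond the
        # censoring time, so the pd is drawn uniformly between p and 1.
        p = rng.uniform(p, 1.0)
    # Assumed convention: keep pd strictly inside (0, 1).
    bound = 1.0 / (2 * k)
    return min(max(p, bound), 1.0 - bound)


def to_npd(pd):
    """Transform a pd to the normal scale with the inverse normal CDF."""
    return NormalDist().inv_cdf(pd)


# Illustrative use with a hypothetical exponential model (hazard 0.1).
rng = random.Random(42)
simulated = [rng.expovariate(0.1) for _ in range(1000)]
npd_event = to_npd(prediction_discrepancy(8.0, True, simulated, rng))
npd_censored = to_npd(prediction_discrepancy(8.0, False, simulated, rng))
```

Because the pd of a censored subject is drawn under the model being evaluated, it carries less information about a misspecification than the pd of an observed event.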

In joint models, we compute separately the pd for the TTE data and the prediction distribution errors (pde) for the longitudinal data, the latter being obtained after decorrelating simulated and observed data [1]. We then propose a combined test that combines the p-values of the tests on the longitudinal data and on the TTE data, adjusted with a Bonferroni correction.
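As an illustration of the Bonferroni adjustment, the sketch below combines the p-values of the two component tests into a single adjusted p-value. The function name and the example p-values are hypothetical; the rule shown (smallest p-value multiplied by the number of tests, capped at 1) is the standard Bonferroni adjustment and is assumed to correspond to the correction used here.

```python
def bonferroni_combined_p(p_values):
    """Return the Bonferroni-adjusted p-value of a set of tests.

    The combined test rejects at level alpha when this value is below alpha,
    which is equivalent to at least one component p-value being below
    alpha divided by the number of tests.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    m = len(p_values)
    return min(1.0, m * min(p_values))


# Hypothetical p-values for the longitudinal (npde) and TTE (npd) tests.
p_longitudinal = 0.03
p_tte = 0.40
p_combined = bonferroni_combined_p([p_longitudinal, p_tte])
reject_at_5_percent = p_combined < 0.05
```

With two responses, a component p-value must fall below 0.025 for the combined test to reject at the 5% level.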

We evaluated the performance of npd/npde through two simulation studies inspired by the work of Desmée et al. [4], who characterised the relationship between the prostate-specific antigen (PSA) biomarker and survival in 500 prostate cancer patients via joint modelling. We simulated event times and PSA trajectories from the joint model for different sample sizes (50, 100, and 200) and evaluated the type I error and power of npd/npde to detect different types of model misspecification. In the first simulation study, we assumed that the PSA model was correct and considered only TTE data. We tested two types of misspecification of the TTE model: the impact of PSA on survival and the baseline hazard model. In the second simulation study, we considered both longitudinal and TTE data. We assumed that the TTE model was correct and tested misspecifications of the parameters of the PSA model.
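The way type I error and power are estimated in such a simulation study can be sketched on a deliberately simplified example. The sketch below is an assumption-laden toy and not the joint PSA-survival model of [4]: event times are exponential, there is no censoring, the predictive distribution is known analytically so that pd = 1 - exp(-rate × t), and the Kolmogorov-Smirnov p-value uses a common asymptotic approximation. All function names are hypothetical.

```python
import math
import random


def ks_uniform_pvalue(u):
    """Approximate p-value of a one-sample KS test of u against U(0,1)."""
    n = len(u)
    xs = sorted(u)
    # KS statistic: largest gap between the empirical CDF and the U(0,1) CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.3:
        return 1.0  # the series below is numerically 1 in this range
    q = 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                  for j in range(1, 101))
    return min(max(q, 0.0), 1.0)


def rejection_rate(true_rate, model_rate, n_subjects, n_replicates, rng,
                   alpha=0.05):
    """Fraction of simulated datasets for which the test rejects the model."""
    rejections = 0
    for _ in range(n_replicates):
        # Data are simulated under the true model ...
        times = [rng.expovariate(true_rate) for _ in range(n_subjects)]
        # ... and the pd are computed under the model being evaluated.
        pd = [1.0 - math.exp(-model_rate * t) for t in times]
        if ks_uniform_pvalue(pd) < alpha:
            rejections += 1
    return rejections / n_replicates


rng = random.Random(2018)
# Type I error: the evaluated model is the model used for simulation.
type_one_error = rejection_rate(0.1, 0.1, 100, 200, rng)
# Power: the evaluated model has a misspecified hazard.
power = rejection_rate(0.1, 0.2, 100, 200, rng)
```

The rejection rate is a type I error when the evaluated model matches the simulation model and a power otherwise; repeating the call over several sample sizes and degrees of misspecification gives the kind of summary reported in the Results.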

Results:

In the first simulation study, for both types of deviation in the TTE component, the type I error associated with the npd-TTE was close to the expected 5% for all sample sizes. The npd-TTE were able to detect both model and parameter misspecifications. In both cases, censoring the TTE data decreased the power. This is expected because the pd of censored observations are imputed under the model being tested.

In the second simulation study, the npde-PSA were able to detect misspecifications in the PSA model, with a type I error close to 5%. A misspecification of an influential parameter of the PSA model was captured by both the npde-PSA and the npd-TTE. This suggests that, when a test rejects the survival model, we should check whether the problem originates in the longitudinal model.

For all types of misspecification, the type I error of the combined test was close to the expected 5%. The power of the combined test to detect model misspecifications increased with the distance from the true model and, as expected, with sample size. Graphically, the increase in power can be related to larger differences in the shape of the survival function or of the PSA evolution.

Conclusions: npd can be readily extended to event data by imputing the pd for censored events under the model [1]. The combined test for multiple responses performed well, with an adequate type I error, and was quite sensitive to the alternative models tested.

Acknowledgments: The authors thank Karl Brendel for his valuable contribution to this work.

References:

[1] Brendel K, Comets E, Laffont C, Laveille C, Mentré F. (2006). Metrics for external model evaluation with an application to the population pharmacokinetics of gliclazide. Pharmaceutical Research, 23(9), 2036-2049.

[2] Cerou M, Lavielle M, Brendel K, Chenel M, Comets E. (2018). Development and performance of npde for the evaluation of time-to-event models. Pharmaceutical Research, 35(2), 30.

[3] Nguyen THT, Comets E, Mentré F. (2012). Extension of NPDE for evaluation of nonlinear mixed effect models in presence of data below the quantification limit with applications to HIV dynamic model. Journal of Pharmacokinetics and Pharmacodynamics, 39(5), 499-518.

[4] Desmée S, Mentré F, Veyrat-Follet C, Guedj J. (2015). Nonlinear mixed-effect models for prostate-specific antigen kinetics and link with survival in the context of metastatic prostate cancer: a comparison by simulation of two-stage and joint approaches. The AAPS Journal, 17(3), 691-699.