Modelling development for count data: NONMEM vs R

Stefano Zamuner, Tarjinder Sahota, Lia Liefaard

Clinical Pharmacology Modelling and Simulation, GSK, UK

Objectives: To explore key model development features for count data analysis in R and NONMEM, including data exploration and model diagnostics.

Different models can be applied to count data. The simplest assumes a Poisson distribution, in which the mean equals the variance (equidispersion) [1]. If the variance is greater than the mean, the data are considered overdispersed. Overdispersion can be modelled in several ways, e.g. by including a between-subject variability (BSV) term on lambda (the mean count) when repeated observations are available, or by using a Negative Binomial distribution [2, 3]. The current work provides recommendations for simulation and estimation of different count data distributions in R and NONMEM, including model diagnostics.
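As a rough illustration of the equidispersion check described above, the ratio of sample variance to sample mean (the dispersion index) can be computed directly. This is a minimal Python sketch, not part of the original analysis; the function name and both sets of counts are hypothetical and invented for illustration.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 under a Poisson model."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical counts, invented for illustration only.
counts_equi = [1, 2, 5, 2, 0, 3, 4, 3]    # index close to 1
counts_over = [0, 9, 1, 0, 12, 0, 2, 0]   # index well above 1

print(round(dispersion_index(counts_equi), 2))
print(round(dispersion_index(counts_over), 2))
```

An index well above 1 would point towards a model with BSV on lambda or a Negative Binomial distribution, although with few observations the index itself is noisy.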

Methods: Data were simulated in R (rpois function). Two cases were explored: a) repeated observations with constant hazard and BSV in lambda (assuming a CV of 30% or 100%); b) a dose-response model in which the hazard is a function of dose (Emax model). Poisson, Negative Binomial and Poisson with BSV (mixed-effects) models were fitted using the R functions glm, glm.nb (“MASS” package) and glmer (“lme4” package), respectively. The same models were fitted in NONMEM 7.2.0 using the Laplacian estimation method [4]. Numerical stability, -2LL, AIC and bias in parameter estimates were compared. Bootstraps were performed to assess standard error estimates.
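The two simulation cases could be sketched as follows. The simulations in this work used rpois in R; the Python version below is only an illustration using the standard library. Several details are assumptions not stated in the abstract: BSV is taken to be log-normal (so omega^2 = ln(1 + CV^2)), the Emax model is written as E0 + Emax*dose/(ED50 + dose), and all parameter values, sample sizes and function names are hypothetical.

```python
import math
import random


def rpois(lam, rng):
    """Draw one Poisson count with mean lam (Knuth's method, fine for modest lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_constant_hazard(n_subjects, n_obs, lam_typical, cv, rng):
    """Case a: repeated counts per subject, log-normal BSV on lambda (assumed)."""
    omega = math.sqrt(math.log(1.0 + cv ** 2))  # SD of eta on the log scale
    data = []
    for _ in range(n_subjects):
        lam_i = lam_typical * math.exp(rng.gauss(0.0, omega))
        data.append([rpois(lam_i, rng) for _ in range(n_obs)])
    return data


def emax_lambda(dose, e0, emax, ed50):
    """Case b: mean count as an Emax function of dose (assumed parameterisation)."""
    return e0 + emax * dose / (ed50 + dose)


def simulate_dose_response(doses, n_per_dose, e0, emax, ed50, rng):
    """One count per subject, n_per_dose subjects at each dose level."""
    return {
        d: [rpois(emax_lambda(d, e0, emax, ed50), rng) for _ in range(n_per_dose)]
        for d in doses
    }


rng = random.Random(2015)  # fixed seed so the sketch is reproducible
case_a = simulate_constant_hazard(50, 10, 5.0, 0.3, rng)
case_b = simulate_dose_response([0, 10, 30, 100], 25, 1.0, 8.0, 20.0, rng)
```

Setting cv to 1.0 instead of 0.3 would give the 100% CV scenario under the same assumptions.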

Results: For the first case study, the mixed-effects model was consistently selected when AIC and the likelihood ratio test (difference in -2LL) were used as model selection criteria. The selected model matched the simulated model, which suggests that selection strategies based on likelihood ratio tests or AIC are sufficient to determine the underlying structural and random-effects model.
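The arithmetic behind the two selection criteria can be sketched with a deliberately simple pair of nested Poisson models: one common lambda versus a separate lambda per group. This is a hypothetical Python illustration, not the mixed-effects comparison performed in this work; the counts, group labels and function names are invented, and the critical values are the standard chi-square 5% points.

```python
import math

# Chi-square critical values at the 5% level, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def poisson_minus2ll(counts, lam):
    """-2 log-likelihood of independent Poisson counts with mean lam."""
    ll = sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)
    return -2.0 * ll


def aic(minus2ll, n_params):
    """Akaike information criterion: -2LL plus twice the number of parameters."""
    return minus2ll + 2 * n_params


def lrt_rejects_reduced(m2ll_reduced, m2ll_full, df):
    """True if the drop in -2LL exceeds the chi-square 5% critical value."""
    return (m2ll_reduced - m2ll_full) > CHI2_CRIT_05[df]


# Hypothetical counts for two groups, invented for illustration only.
low = [1, 0, 2, 1, 1, 0, 2, 1]
high = [6, 4, 7, 5, 8, 5, 6, 7]
pooled = low + high

# The maximum likelihood estimate of a Poisson mean is the sample mean.
m2ll_reduced = poisson_minus2ll(pooled, sum(pooled) / len(pooled))
m2ll_full = (poisson_minus2ll(low, sum(low) / len(low))
             + poisson_minus2ll(high, sum(high) / len(high)))

print("AIC reduced:", round(aic(m2ll_reduced, 1), 2))
print("AIC full:   ", round(aic(m2ll_full, 2), 2))
print("LRT favours full model:", lrt_rejects_reduced(m2ll_reduced, m2ll_full, 1))
```

The same two functions apply to -2LL values reported by any software, provided both models are fitted to the same data and the likelihoods include the same constants.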

Bias in parameter estimates was model dependent and consistent across software. In R, both glm and glm.nb, but not glmer, appeared to significantly underestimate the standard errors of parameters compared with bootstrap results.
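A toy comparison of a model-based standard error with a bootstrap standard error is sketched below. It does not reproduce the bootstraps performed in this work. It assumes a pooled Poisson model with a single lambda, for which the model-based standard error is sqrt(lambda/n), and a nonparametric bootstrap that resamples whole subjects. The data and function names are hypothetical.

```python
import math
import random
from statistics import stdev

# Hypothetical repeated counts per subject, invented for illustration only.
subjects = [
    [1, 0, 2, 1],
    [8, 7, 9, 10],
    [3, 2, 4, 3],
    [0, 1, 0, 0],
    [12, 11, 14, 13],
    [5, 4, 6, 5],
]


def pooled_lambda(data):
    """Poisson MLE of a single lambda: the mean of all counts."""
    counts = [y for row in data for y in row]
    return sum(counts) / len(counts)


def model_based_se(data):
    """SE of the pooled lambda if all counts were independent Poisson draws."""
    n_total = sum(len(row) for row in data)
    return math.sqrt(pooled_lambda(data) / n_total)


def bootstrap_se(data, n_boot, rng):
    """SE from resampling subjects with replacement (keeps each subject's rows intact)."""
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]
        estimates.append(pooled_lambda(resample))
    return stdev(estimates)


rng = random.Random(3629)  # fixed seed so the sketch is reproducible
se_model = model_based_se(subjects)
se_boot = bootstrap_se(subjects, 1000, rng)
print("model-based SE:", round(se_model, 3))
print("bootstrap SE:  ", round(se_boot, 3))
```

In this toy data set the subjects differ markedly in their mean counts, so the bootstrap standard error comes out larger than the model-based one, which treats every count as independent.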

The $COV step in NONMEM provided more accurate standard errors than R for all models. NONMEM and R gave similar OFV values and parameter estimates.

Conclusions: The results of this analysis show that R and NONMEM are both adequate to describe longitudinal count data with constant hazard.

References:
[1] Ines Paule, Pascal Girard, Gilles Freyer, Michel Tod. Pharmacodynamic Models for Discrete Data, Clin Pharmacokinet (2012), 51:767–786.
[2] Iñaki F. Trocóniz, Elodie L. Plan, Raymond Miller, Mats O. Karlsson. Modelling overdispersion and Markovian features in count data. J Pharmacokinet Pharmacodyn (2009), 36:461–477.
[3] Elodie L. Plan. Modeling and Simulation of Count Data. CPT Pharmacometrics Syst Pharmacol (2014), 3:e129.
[4] Elodie Plan, Alan Maloney, Iñaki F. Trocóniz, Mats O. Karlsson. Maximum Likelihood Estimation Methods: Performances in Count Response Models Population Parameters. PAGE 17 (2008) Abstr 1372.

Reference: PAGE 24 (2015) Abstr 3629 [www.page-meeting.org/?abstract=3629]

Poster: Methodology - Model Evaluation