Pavan Vajjah Novel graphical diagnostics for assessing fit of logistic regression models

Venkata Pavan Kumar Vajjah, Stephen B Duffull

School of Pharmacy, University of Otago, Dunedin, New Zealand

Background:

Assessing the goodness of fit of a model to a data set is essential to ensure that the model provides a reasonable description of the observed events. In this setting graphical diagnostics, such as visual predictive checks, have an advantage over numerical criteria, such as minus twice the log-likelihood, since the latter cannot determine whether the model accurately describes the data. For logistic regression, a common graphical diagnostic is to bin the data and compare the empirical probability of an event in each bin with the model-predicted probability at the mean covariate value in the bin. Although intuitively appealing, this method, termed simple binning, may not have useful properties for diagnosing model problems when the study design is unbalanced.
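As an illustration only, simple binning could be sketched as follows. The abstract does not state how the bins are formed, so the equal-count bins on dose-sorted subjects, the function name `simple_binning` and the `predict` callable are assumptions made for this sketch.

```python
import numpy as np

def simple_binning(dose, event, predict, n_bins=5):
    """Compare empirical and model-predicted event probabilities per bin.

    Assumption: bins hold (nearly) equal numbers of dose-sorted subjects.
    `predict` is a hypothetical callable mapping dose to event probability.
    """
    dose = np.asarray(dose, dtype=float)
    event = np.asarray(event, dtype=float)
    order = np.argsort(dose, kind="stable")
    # Split the dose-sorted subject indices into n_bins groups.
    groups = np.array_split(order, n_bins)
    mean_dose = np.array([dose[g].mean() for g in groups])
    # Empirical probability = fraction of subjects in the bin with an event.
    emp_prob = np.array([event[g].mean() for g in groups])
    # Model prediction evaluated at the mean covariate value of each bin.
    pred_prob = np.asarray(predict(mean_dose), dtype=float)
    return mean_dose, emp_prob, pred_prob
```

Plotting `emp_prob` and `pred_prob` against `mean_dose` gives the conventional diagnostic; a single fixed set of bins is what distinguishes it from the random binning proposed in this work.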

Objective:

To develop graphical diagnostics to assess the fit of logistic regression models.

Methods:

Study design: Three types of study design were considered. Design 1: studies balanced on both events (y-axis) and dose (x-axis covariate). Design 2: studies balanced on events but unbalanced on dose. Design 3: studies unbalanced on both events and dose.

Simulation: Each simulated data set consisted of 500 subjects. The administered dose was the only covariate and could be 0, 1, 5, 10 or 20 units for design 1, and any integer from 0 to 20 for designs 2 and 3. The number of individuals per dose level was equal for design 1 and unequal for designs 2 and 3. Approximately 50% of subjects had an event in designs 1 and 2 and approximately 10% in design 3. The data were simulated with dose related to the outcome according to an E_{max} model on the logit scale, as shown in equation 1.

In the equation, E_{0} and E_{max} are the baseline and maximum probability (π) of having an event and ED_{50} is the dose (D) at which the probability of an event is half E_{max}. The values of π (E_{0}), π (E_{max}) and ED_{50} were 0.2, 0.9 and 5 for designs 1 and 2 and 0.05, 0.85 and 5 for design 3. The coefficient of variation of each parameter used for simulation was 15%. An E_{max} model was chosen because in PKPD it is common for the probability of an event to asymptote below 1.0. Thirty data sets were simulated using MATLAB. This number was chosen to provide a 90% chance that both an excellent-case and a worst-case visual diagnostic would be seen.
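A minimal Python sketch of this kind of simulation is given below for orientation; the original work used MATLAB and this is not that code. It assumes the model has the form logit(π) = E_{0} + E_{max}·D/(ED_{50} + D), with E_{0} the logit of the baseline probability and E_{0} + E_{max} the logit of the maximum probability. It also assumes a randomly drawn unequal dose allocation for designs 2 and 3, which the abstract does not specify, and it omits the 15% parameter variability. All function names are hypothetical.

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def emax_probability(dose, p0, pmax, ed50):
    """Event probability under an assumed Emax model on the logit scale."""
    e0 = logit(p0)            # baseline on the logit scale
    emax = logit(pmax) - e0   # assumed: asymptote reaches logit(pmax)
    return expit(e0 + emax * dose / (ed50 + dose))

def simulate_design(design, n=500, seed=0):
    """Simulate one data set of doses and binary events for design 1, 2 or 3."""
    rng = np.random.default_rng(seed)
    if design == 1:
        # Balanced on dose: equal numbers at each of five dose levels.
        dose = np.repeat([0.0, 1.0, 5.0, 10.0, 20.0], n // 5)
        p0, pmax = 0.2, 0.9
    else:
        # Assumed allocation: random unequal weights over integer doses 0..20.
        weights = rng.dirichlet(np.ones(21))
        dose = rng.choice(np.arange(21), size=n, p=weights).astype(float)
        p0, pmax = (0.2, 0.9) if design == 2 else (0.05, 0.85)
    prob = emax_probability(dose, p0, pmax, ed50=5.0)
    event = rng.binomial(1, prob)  # one Bernoulli draw per subject
    return dose, event
```

Because the dose allocation here is an assumption, the overall event rates produced by this sketch need not match the approximate 50% and 10% rates reported for the designs.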

Estimation: Estimation was performed in WinBUGS 1.4.3. Each data set was fitted with the E_{max} model (correct model) and with a linear model (wrong model), with dose as the only covariate and a logistic transformation to the probability domain.

Diagnostics: We propose two diagnostics: (1) random binning and (2) a simplified Bayes marginal model plot^{1}.

(1) The idea behind random binning is to generate a distribution of empirical probabilities at various dose levels. This is achieved by randomly binning the data set, based on dose or number of individuals, to produce 1000 different sets of bins. For each of these 1000 random sets of bins, the empirical probability in each bin is estimated. The estimated empirical probabilities and the model predictions are then plotted to visually inspect the model fit.

(2) For simplified Bayes marginal model plots, the hypothesis is that 'if the model describes the data, then if we simulate n observations from the posterior distribution of the model, the spline should be one of those observations'. The methodology is as follows. A linear spline with up to two estimated knots was fitted to the data using WinBUGS 1.4.3. This was presumed to be the best empirical description of the data. The posterior distributions of the fits of the E_{max} and linear models were then compared with the spline and the level of visual agreement between them was assessed. Both proposed diagnostics were compared with simple binning.
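Random binning, diagnostic (1), might be sketched along the following lines. This is an illustrative reading and not the authors' implementation: it bins by number of individuals, drawing random cut points among the dose-sorted subjects, and it fixes the number of bins per set. The function name, the default of five bins and the uniform draw of cut points are all assumptions.

```python
import numpy as np

def random_binning(dose, event, n_bins=5, n_sets=1000, seed=0):
    """Empirical event probabilities over many random sets of bins.

    Returns two (n_sets, n_bins) arrays: the mean dose and the empirical
    event probability in each bin of each random set.
    """
    rng = np.random.default_rng(seed)
    dose = np.asarray(dose, dtype=float)
    event = np.asarray(event, dtype=float)
    order = np.argsort(dose, kind="stable")
    dose, event = dose[order], event[order]
    n = dose.size
    mean_dose = np.empty((n_sets, n_bins))
    emp_prob = np.empty((n_sets, n_bins))
    for s in range(n_sets):
        # Assumed scheme: n_bins - 1 distinct interior cut points chosen
        # uniformly at random, so every bin holds at least one subject.
        cuts = np.sort(rng.choice(np.arange(1, n), size=n_bins - 1, replace=False))
        for b, idx in enumerate(np.split(np.arange(n), cuts)):
            mean_dose[s, b] = dose[idx].mean()
            emp_prob[s, b] = event[idx].mean()
    return mean_dose, emp_prob
```

Scattering all `(mean_dose, emp_prob)` pairs on one plot and overlaying the model-predicted probability curve gives a cloud of empirical probabilities against which the model fit can be inspected visually.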

Results and discussion:

For all designs the proposed diagnostics performed at least as well as, or better than, simple binning. In design 1, where both the covariate and event space are balanced, random binning and simple (conventional) binning are the same and provide good diagnostic features. In designs 2 and 3, random binning and simplified Bayes marginal model plots were superior to simple binning in assessing model fit. With simple binning, examples were seen where the wrong model was preferred and also where the correct model would have been completely discounted as an acceptable descriptor of the data. For the completely unbalanced scenario (design 3) there were cases where the simplified Bayes marginal model plots provided superior discriminatory ability to random binning. In all cases for design 3, random binning was superior to simple binning. The main limitation of simplified Bayes marginal model plots is that they require additional computation. The above diagnostics have been tested for fixed effects models but can be extended to mixed effects models.

Conclusion:

Simple binning fails to provide either the ability to consistently identify the correct model or the ability to identify model deficiencies when the study design is unbalanced. Random binning and the Bayes marginal model plots provide good visual assessment of model performance.

Reference: [1] Pardoe I, Cook R D. A Graphical Method for Assessing the Fit of a Logistic Regression Model. The American Statistician. 2002; 56(4): 263-272.