Andreas Krause, Florilene Bouisset, Amy Racine
Novartis Pharma AG, Biostatistics/Modeling and Simulation, P.O. Box, 4002 Basel, Switzerland
Objectives: Clinical studies mostly generate incomplete data. The fraction of missing data can range from small to substantial, and the reasons are manifold: the data was not recorded, the data was lost on its way to the clinical database, or patients discontinued treatment. In all these cases, analyzing only the complete data is unproblematic provided the missingness is completely at random.
However, if the missingness depends on other variables, the missingness process must be modeled in order to correct for the bias that would otherwise result.
This poster outlines the planning of a recent study using modeling and simulation. In the anticipated scenario,
– 30 percent of enrolled patients are expected to discontinue treatment before the end of the study
– the probability of discontinuation depends on the well-being (or not) of a patient
In other words, it was anticipated that a patient who does not respond well to treatment is more likely to discontinue the treatment.
Methods: The modeling approach is a longitudinal mixed effects model with additional assumptions on the discontinuation process. Subjects are assumed to have a linear disease progression with different slopes for the treatment groups. A subject who discontinues treatment is assumed to switch instantaneously from a treatment profile to a placebo profile. To enable building a placebo model, subjects discontinuing the drug will be asked to continue with the scheduled visits.
The anticipated study setup, conduct, and disease progression were simulated 1,000 times, and discontinuation was simulated with probabilities that vary with individual disease progression. The individual probabilities of discontinuation are a function of the random effects, that is, of the individual deviations from the population average. Subjects with better than population average disease progression were assigned lower probabilities of discontinuation, and subjects with worse than population average disease progression were assigned higher probabilities. Discontinuation was simulated and its effect on the final parameter estimates was assessed.
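The simulation step described above can be sketched roughly as follows. This is a minimal illustration, not the model used in the study: the function name `simulate_trial`, all parameter values, the logistic link between the random effect and the discontinuation probability, and the convention that a more negative slope means worse progression are assumptions made for the example.

```python
import math
import random

def simulate_trial(n_per_arm=200, visits=(0, 1, 2, 3, 4),
                   placebo_slope=-1.0, treatment_slope=-0.5,
                   sd_slope=0.3, sd_resid=0.5,
                   base_dropout=0.30, sensitivity=1.0, seed=None):
    """Simulate one trial with linear progression and informative discontinuation.

    All defaults are hypothetical. A more negative slope is taken to mean
    worse disease progression (an assumption of this sketch).
    """
    rng = random.Random(seed)
    # Logit of the discontinuation probability for a population-average subject.
    logit_base = math.log(base_dropout / (1.0 - base_dropout))
    subjects = []
    for arm, arm_slope in (("placebo", placebo_slope),
                           ("treatment", treatment_slope)):
        for _ in range(n_per_arm):
            # Random effect: individual deviation from the population slope.
            eta = rng.gauss(0.0, sd_slope)
            # Worse than average progression (eta < 0) raises the
            # discontinuation probability; better than average lowers it.
            z = logit_base - sensitivity * eta / sd_slope
            p_drop = 1.0 / (1.0 + math.exp(-z))
            drop_visit = None
            if rng.random() < p_drop:
                # Discontinuation at a randomly chosen intermediate visit.
                drop_visit = rng.choice(visits[1:-1])
            y, level = [], 0.0
            for k, t in enumerate(visits):
                if k > 0:
                    on_drug = drop_visit is None or visits[k - 1] < drop_visit
                    # After discontinuation the subject follows the placebo slope.
                    slope = (arm_slope if on_drug else placebo_slope) + eta
                    level += slope * (t - visits[k - 1])
                y.append(level + rng.gauss(0.0, sd_resid))
            subjects.append({"arm": arm, "eta": eta,
                             "drop_visit": drop_visit, "y": y})
    return subjects
```

Calling `simulate_trial(seed=...)` repeatedly with different seeds would give the replicated trials; subjects continue to be observed after discontinuation, mirroring the plan to keep scheduled visits.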
The incomplete data as observed (simulated) were evaluated with a mixed effects model and with a standard per-protocol analysis based on complete patient records only (since drug discontinuation is regarded as a protocol violation). Both evaluations are contrasted against the evaluations of the known true complete data.
Results: In this particular setup we show that a per-protocol analysis underestimates the treatment effect by about 50 percent, with a corresponding loss in power.
Using the model-based approach, the primary parameter (difference of treatment to placebo) was estimated as 117 percent of the simulated value (averaged over 1,000 simulations), whereas the per-protocol analysis yielded 49.6 percent of the simulated value. Powers (fractions of hypothesis rejections in the simulations) were estimated as 80 and 61 percent, respectively.
For further study parameters, the model-based approach is much closer to the simulated values, whereas the per-protocol analysis consistently underestimates by about 50 percent.
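The two summary measures reported here, the average estimate as a percentage of the simulated value and the power as the fraction of rejections, could be computed along these lines. The function name `summarize_simulations`, its arguments, and the significance level of 0.05 are assumptions for illustration; the abstract does not state the level used.

```python
def summarize_simulations(estimates, p_values, true_value, alpha=0.05):
    """Summarize replicated simulations (hypothetical helper).

    Returns the mean estimate as a percentage of the simulated (true) value
    and the empirical power in percent, i.e. the share of runs with p < alpha.
    """
    mean_estimate = sum(estimates) / len(estimates)
    relative_percent = 100.0 * mean_estimate / true_value
    rejections = sum(1 for p in p_values if p < alpha)
    power_percent = 100.0 * rejections / len(p_values)
    return relative_percent, power_percent
```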
Interestingly, only a fraction of all model estimation runs converged, between 20 and 100 percent depending on the parameter. The accuracy (deviation of the model-based estimates from the original values) increases with the fraction of converged model runs (using SAS PROC NLMIXED). This result seems to suggest that whether or not a model estimation converges successfully might partially depend on the underlying model parameter values.
Conclusions: If a study yields more than just a few incomplete data records, it must be investigated whether the reason for missingness is related to the absence or presence of a treatment effect (or to other circumstances). If the missingness is non-ignorable, the process of treatment discontinuation needs to be modeled.
Using a per-protocol analysis (complete cases only) or imputation by LOCF (Last Observation Carried Forward) can produce severely biased results, as shown in this particular study setup, where the treatment effect would be underestimated by about 50 percent.
Reference: PAGE 14 (2005) Abstr 742 [www.page-meeting.org/?abstract=742]