Application of Item Response Theory to ADAS-cog Scores Modelling in Alzheimer's Disease
Sebastian Ueckert (1), Elodie L. Plan (2), Kaori Ito (3), Mats O. Karlsson (1), Brian Corrigan (3) and Andrew C. Hooker (1)
(1) Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; (2) Metrum Research Group, Tariffville, CT, USA; (3) Global Clinical Pharmacology, Pfizer Inc, Groton, CT, USA
Objectives: The challenges in the development of new therapeutic agents for Alzheimer's Disease (AD) become apparent through the high number of failed late-phase trials [1]. Despite an increasing interest in biomarkers, cognition remains the primary regulatory-accepted clinical outcome. The most frequently used test, ADAS-cog, consists of a broad spectrum of tasks that test different components of cognition [2]. The total ADAS-cog score is obtained by rating a subject's performance in each of the subtests and summing the resulting subscores to yield an overall assessment. In turn, pharmacometric models traditionally describe Alzheimer's disease progression using this summary score [3,4]. An alternative approach, explored in this work, is to model each subscore separately and link the model subcomponents to a common unobserved variable, "cognitive disability". In psychometrics, this method is used to study the sensitivity of items in standardized educational tests, and the approach is referred to as item response theory (IRT) [5]. The aims of this work were a) to develop an IRT model for ADAS-cog scores, b) to compare the performance of a longitudinal model using item-level or summary data, and c) to apply optimal design to the selection of the most informative battery of tests in a given population.
Methods:
1 ADAS-cog IRT Model
Baseline ADAS-cog assessments with item-level data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) [6] and the Coalition Against Major Diseases (CAMD) [7] databases were used for this part of the project. The resulting dataset consisted of 2651 subjects from 7 studies with a total of 152313 baseline observations. For each subtest of the cognitive assessment, a binary, count, or ordered categorical model was developed, depending on the nature of the data, describing the probability of a failed test outcome as a function of the latent cognitive disability. All parameters characterizing an individual test item were expressed as fixed effects, whereas the cognitive disability was modeled as a subject-specific random effect. The model performance was evaluated through comparison of observed and simulated data for each item.
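As an illustration of the item-level approach (not the estimated model itself), a binary subtest can be described with a two-parameter logistic item model, where the failure probability rises with the latent cognitive disability. The parameter names and values below are hypothetical:

```python
import math

def p_fail(theta, a, b):
    """Probability of failing a binary test item under a hypothetical
    two-parameter logistic (2PL) item model.

    theta : latent cognitive disability (higher = more impaired)
    a     : item discrimination (fixed effect, item-specific)
    b     : item difficulty (fixed effect, item-specific)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

Under this sketch, a healthy subject (low theta) has a low failure probability and a severely impaired patient (high theta) a high one, which is the shape of the well-defined item characteristic curves described in the results.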
2 Longitudinal Model Comparison
Based on the accessibility of study protocol information, one study from the CAMD database was selected for a complete longitudinal analysis. The available data consisted of the placebo arm of an 18-month AD trial with a total of 322 patients and 7 ADAS-cog assessments per patient. The basic longitudinal model, without covariates, published by Ito et al. [3] was applied to a) the summary ADAS-cog score and b) the hidden cognitive disability variable. The model adequacy was assessed through visual predictive checks on both item and summary levels, and parameter precision was evaluated through a posterior predictive check of the mean ADAS-cog score at baseline and the mean annual change in ADAS-cog.
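To make the latent-variable version of the longitudinal analysis concrete, a minimal sketch of simulated cognitive-disability trajectories is shown below, assuming a simple linear progression with a subject-specific baseline and annual slope (the actual Ito et al. model contains additional structure; all numerical values are hypothetical):

```python
import random

def simulate_disability(n_subjects, times, mean_base=0.0, mean_slope=0.5,
                        sd_base=1.0, sd_slope=0.1, seed=1):
    """Simulate latent cognitive-disability trajectories, one per subject,
    with a subject-specific random baseline and annual slope.

    times is in years; all parameter values are illustrative only.
    """
    rng = random.Random(seed)
    trajectories = []
    for _ in range(n_subjects):
        base = rng.gauss(mean_base, sd_base)    # subject-specific baseline
        slope = rng.gauss(mean_slope, sd_slope)  # subject-specific progression
        trajectories.append([base + slope * t for t in times])
    return trajectories
```

In the item-level analysis, each simulated disability value would then be mapped through the item submodels to subscore probabilities, rather than directly to a summary score.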
3 Optimal Test Design
Based on the developed IRT model, the Fisher information for estimating a patient's cognitive disability was calculated for each item in the ADAS-cog test. The test items were ranked by information content within a mildly cognitively impaired (MCI) and a mild AD (mAD) patient population. Furthermore, the additional amount of information added to an ADAS-cog assessment through incorporation of additional components ("delayed word recall" and "number cancellation" [8]) was evaluated in both populations.
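For a binary item under a two-parameter logistic model, the Fisher information about the latent disability has the closed form I(theta) = a^2 * p * (1 - p), which peaks where the item difficulty matches the patient's disability. A minimal sketch of the item-ranking step, with hypothetical item names and parameters:

```python
import math

def item_information(theta, a, b):
    """Fisher information of a binary 2PL item about the latent disability:
    I(theta) = a^2 * p * (1 - p), where p is the failure probability."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def rank_items(theta, items):
    """Rank (name, a, b) tuples by information content at a given
    disability level, most informative first."""
    return sorted(items, key=lambda it: -item_information(theta, it[1], it[2]))
```

Because the information depends on theta, the ranking shifts with the population's disability level, which is why the most informative battery differs between MCI and mAD populations.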
Results:
1 ADAS-cog IRT Model
The final ADAS-cog IRT model consisted of 39 binary, 5 binomial, 1 generalized Poisson and 5 ordered categorical submodels with a total of 166 parameters. Simulations from the individual models were in excellent agreement with the observed data. All but one of the estimated item characteristic curves were well defined, with a low failure probability for healthy subjects and a high failure probability for severely impaired patients. Only the characteristic curve for the task "state your name" was essentially flat.
2 Longitudinal Model Comparison
Without re-estimation of the item-specific parameters, the IRT model described the longitudinal nature of most of the test subcomponents satisfactorily. On visual predictive checks, no difference between the prediction intervals obtained with the summary score model and the IRT model was observed; however, the 95% confidence intervals of both the mean baseline score and the annual change were narrower with the IRT model.
3 Optimal Test Design
The information content ranking of the subcomponents in a classical ADAS-cog assessment differed between the two patient populations. For the MCI population the word recall component was most informative, while for the mAD population the orientation component carried the most information. Similarly, there was an apparent difference in the relative amount of information added by including the delayed word recall and number cancellation components: with these additional components, the information content of the complete ADAS-cog assessment increased by 78% in the MCI population compared to only 35% in the mAD population.
Conclusions: Utilizing IRT, the information available in clinical trial databases can be used to characterize the relationships between the individual items of a cognitive assessment. The resulting mathematical description can serve as a platform for future trials, with the advantages of a) a more exact replication of the score distribution, b) an implicit mechanism for handling missing information, and c) the ability to easily combine data from different ADAS-cog variants. Parameter estimates obtained through application of the IRT model to longitudinal clinical trial data were more precise than those obtained through a summary score-based model, indicating a higher probability of detecting changes due to a drug effect. Another feature demonstrated in this work is the capability to quantify the information content of the individual components of a cognitive assessment and the possibility to tailor a cognitive assessment to a patient population's degree of disability. A population-specific test would not only be more sensitive to changes due to disease progression or drug effect, but would also reduce the assessment time and thus the burden on the patient. In addition, IRT allows the combination of different cognitive assessments, such as the mini-mental state examination (MMSE), into one common pharmacometric model. Many of the benefits of using item-level models are not exclusive to AD, but can easily be extended to other disease areas where summary scores constitute an important clinical measure, e.g. Parkinson's disease or rheumatoid arthritis.
[1] Becker RE, Greig NH. Alzheimer's Disease Drug Development in 2008 and Beyond: Problems and Opportunities. Curr Alzheimer Res. 2008 Aug;5(4):346–57.
[2] Rosen WG, Mohs RC, Davis KL. A new rating scale for Alzheimer's disease. Am J Psychiatry. 1984 Nov;141(11):1356–64.
[3] Ito K, Corrigan B, Zhao Q, French J, Miller R, Soares H, et al. Disease progression model for cognitive deterioration from Alzheimer's Disease Neuroimaging Initiative database. Alzheimer's and Dementia. 2011 Mar;7(2):151–60.
[4] Samtani MN, Farnum M, Lobanov V, Yang E, Raghavan N, Dibernardo A, et al. An Improved Model for Disease Progression in Patients From the Alzheimer's Disease Neuroimaging Initiative. J Clin Pharmacol [Internet]. 2011 Jun 9 [cited 2012 Feb 21]; Available from: http://www.ncbi.nlm.nih.gov/pubmed/21659625
[5] Hambleton RK, Swaminathan H, Rogers HJ. Fundamentals of item response theory. SAGE; 1991.
[6] ADNI (Alzheimer's Disease Neuroimaging Initiative). Available from: http://www.adni-info.org/
[7] CAMD (Coalition Against Major Disease). Available from: http://www.c-path.org
[8] Mohs RC, Knopman D, Petersen RC, Ferris SH, Ernesto C, Grundman M, et al. Development of cognitive instruments for use in clinical trials of antidementia drugs: additions to the Alzheimer's Disease Assessment Scale that broaden its scope. The Alzheimer's Disease Cooperative Study. Alzheimer Dis Assoc Disord. 1997;11 Suppl 2:S13–21.