Emilie Schindler, Lena E. Friberg, Mats O. Karlsson
Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
Objectives: Patient-reported outcomes, usually assessed using questionnaires, are increasingly collected during clinical trials to evaluate variables that are not directly quantifiable, such as fatigue, health-related quality of life or pain. Their multi-scale nature makes their analysis challenging, and item response theory (IRT) in a non-linear mixed effect modeling framework [1] offers an alternative to classical test theory based on the total score (TS). The aim of this analysis was to compare the IRT and TS approaches for power/sample size calculation based on longitudinal questionnaire data, for different magnitudes of variability in the discrimination parameter across items.
Methods: An IRT model was used to simulate item-level data for a 7-item questionnaire in a parallel-group trial with one placebo arm and one active dose arm, 1000 patients/arm and 6 occasions per patient. Each item had scores ranging from 0 to 4, with the probability of each score described by a proportional odds model. The discrimination and difficulty parameters used for the simulations were obtained from IRT modelling of the physical subscale of the baseline Functional Assessment of Cancer Therapy-Breast (FACT-B) in metastatic breast cancer patients [2]. Four scenarios were simulated, with 0%, 50%, 100% and 200% of the original variability in discrimination parameters. The latent variable $D_i$ was assumed to vary over time according to the following equation: $D_i(t) = D_{i,0} + (\theta_1 \cdot x_{grp} + \eta_2) \cdot \mathrm{Time}$, where $D_{i,0} = D_i(0)$ is a standard normally distributed random variable, $x_{grp} = 0$ in the placebo group and $x_{grp} = 1$ in the treatment group. Total scores for the TS analysis were calculated as the sum of the simulated item responses. The Monte-Carlo Mapped Power method [3], implemented in the PsN software, was used for power calculation.
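The simulation step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the item parameters, the drug effect `theta1`, the standard deviation `omega_slope` of η2, the treatment of η2 as a normally distributed subject-level random effect on the slope, and the coding of the 6 occasions as times 0 to 5 are all assumptions made for illustration, since the abstract does not report them. The proportional odds model is written here in its common cumulative-logit form, P(Y ≥ k) = expit(a_j (D_i(t) − b_jk)).

```python
import numpy as np

def simulate_trial(n_per_arm=1000, n_items=7, n_occasions=6,
                   theta1=0.1, omega_slope=0.05, seed=0):
    """Simulate item scores (0-4) and total scores for a two-arm trial.

    All numeric defaults are hypothetical placeholders, not values
    taken from the abstract or from the FACT-B analysis.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical item parameters: discrimination a_j and four ordered
    # difficulty thresholds b_j1 < ... < b_j4 per item.
    a = rng.uniform(0.8, 2.0, size=n_items)
    b = np.sort(rng.normal(0.0, 1.0, size=(n_items, 4)), axis=1)

    n = 2 * n_per_arm
    x_grp = np.repeat([0, 1], n_per_arm)        # 0 = placebo, 1 = active
    d0 = rng.standard_normal(n)                 # D_i(0) ~ N(0, 1)
    eta2 = rng.normal(0.0, omega_slope, size=n) # assumed slope random effect
    time = np.arange(n_occasions, dtype=float)  # assumed time coding 0..5

    # Latent variable D_i(t) = D_i0 + (theta1 * x_grp + eta2) * Time
    d = d0[:, None] + (theta1 * x_grp + eta2)[:, None] * time[None, :]

    # Cumulative probabilities P(Y >= k), k = 1..4,
    # shape (patients, occasions, items, thresholds).
    logit = a[None, None, :, None] * (d[:, :, None, None] - b[None, None, :, :])
    p_ge = 1.0 / (1.0 + np.exp(-logit))

    # One uniform draw per response; because P(Y >= k) decreases in k,
    # the score equals the number of cumulative probabilities above the draw.
    u = rng.random(size=(n, n_occasions, n_items, 1))
    items = (u < p_ge).sum(axis=-1)

    # Total score used by the TS analysis: sum over the items.
    total = items.sum(axis=-1)
    return x_grp, items, total

x_grp, items, total = simulate_trial(n_per_arm=50, seed=1)
```

The returned `items` array holds the item-level responses that an IRT analysis would use, while `total` holds the summed score per patient and occasion used by the TS analysis.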
Results: In all four scenarios, the IRT approach required smaller sample sizes than the TS approach to achieve 80% power to detect a drug effect (18%, 20%, 26% and 40% fewer patients for 0%, 50%, 100% and 200% of the original variability in discrimination parameters, respectively). IRT was less sensitive than TS to variability in discrimination parameters.
Conclusions: The value of IRT modelling over the TS approach may increase as the variability in discrimination parameters across items increases.
References:
[1] Ueckert S. et al. Improved utilization of ADAS-cog assessment data through item response theory based pharmacometric modeling. Pharm Res, 2014; 31(8):2152-65.
[2] Welslau M. et al. Patient-reported outcomes from EMILIA, a randomized phase 3 study of trastuzumab emtansine (T-DM1) versus capecitabine and lapatinib in human epidermal growth factor receptor 2-positive locally advanced or metastatic breast cancer. Cancer, 2014; 120(5):642-51.
[3] Vong C. et al. Rapid sample size calculations for a defined likelihood ratio test-based power in mixed effects models. AAPS J, 2012; 14(2):176-86.
Reference: PAGE 24 () Abstr 3468 [www.page-meeting.org/?abstract=3468]
Poster: Methodology - Estimation Methods