Leticia Arrington(1,2), Mats O. Karlsson(1), Sebastian Ueckert(1)
(1) Uppsala University, Uppsala, Sweden (2) Merck & Co., Inc., Kenilworth, NJ, USA
Introduction: Item response theory (IRT) is a statistical approach, historically used in psychometrics, that evaluates the relationship between an underlying hidden trait and the item-level responses measured by an assessment. Pharmacometric (i.e. longitudinal) IRT modeling accounts for the change in these responses over time. It is a valuable approach for analyzing healthcare-related composite assessments because it provides a framework for combining different disease outcomes into a joint disease progression model. NONMEM is the classical software used for longitudinal IRT models. To date, item parameter estimation, parameter recovery and overall model performance in NONMEM have not been evaluated against alternative software platforms. The R package mirt (multidimensional IRT) can estimate both unidimensional and multidimensional IRT models using maximum likelihood methods, which makes it a good candidate for comparison with NONMEM.
Objectives: The objective of this work was to perform a systematic comparison of two software tools, NONMEM and the R package mirt (multidimensional item response theory), and to evaluate estimation performance and item parameter recovery across scenarios with varying numbers of subjects, items and latent variables.
Methods: A simulation study was performed at baseline across scenarios varying in sample size and number of items. Three sample sizes (N=50, 100, 500), two assessment lengths (5 and 20 items) and one or two latent variables were evaluated. Items were assumed to be ordered categorical with 5 response categories (0-4). The discrimination parameters were randomly sampled from a lognormal distribution (0,0.52), and the threshold parameters from uniform distributions: (-2,1) for b1 and (0.4,1) for the remaining threshold parameters. For each sample size and item scenario, 1000 replicate datasets were simulated in mirt from the sampled item parameters. IRT model estimation was performed using the Laplace estimation method in NONMEM and stochastic EM with fixed Gaussian quadrature in mirt. Item parameter recovery was evaluated using estimation error and root mean square error (RMSE). Model fit, in terms of log-likelihood, and run time were also compared.
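The simulation and recovery steps described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the mirt or NONMEM code used in the study. It makes several assumptions that the abstract does not state: a graded response model with a logistic link, a standard normal latent variable, thresholds built as b1 plus positive increments drawn from the uniform (0.4,1) range, and a placeholder log-scale standard deviation (log_sd) for the discrimination parameter. All function names are hypothetical.

```python
import math
import random


def sample_item(rng, n_cat=5, log_sd=0.5):
    """Sample one item's discrimination and ordered thresholds.

    log_sd is a placeholder assumption for the lognormal spread.
    """
    a = rng.lognormvariate(0.0, log_sd)
    b = [rng.uniform(-2.0, 1.0)]  # first threshold b1
    for _ in range(n_cat - 2):
        # Assumed: remaining thresholds are positive increments on b1
        b.append(b[-1] + rng.uniform(0.4, 1.0))
    return a, b


def category_probs(theta, a, b):
    """Graded response model: P(Y >= k) = logistic(a * (theta - b_k))."""
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b]
    cum.append(0.0)
    # Category probabilities are differences of adjacent cumulative terms
    return [cum[k] - cum[k + 1] for k in range(len(b) + 1)]


def simulate_dataset(rng, items, n_subjects):
    """Simulate item responses (0..n_cat-1) for n_subjects."""
    data = []
    for _ in range(n_subjects):
        theta = rng.gauss(0.0, 1.0)  # assumed standard normal latent trait
        row = []
        for a, b in items:
            probs = category_probs(theta, a, b)
            row.append(rng.choices(range(len(probs)), weights=probs)[0])
        data.append(row)
    return data


def rmse(estimates, truth):
    """Root mean square error of replicate estimates around the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


rng = random.Random(1)
items = [sample_item(rng) for _ in range(5)]          # 5-item scenario
data = simulate_dataset(rng, items, n_subjects=50)    # N=50 scenario
```

In a full study, each simulated dataset would be passed to the estimation software, and rmse would be applied per parameter to the estimates collected across replicates.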
Results: Overall, mirt and NONMEM performed similarly, with relatively low estimation error (median near zero) across all scenarios, indicating reasonable item parameter recovery. Estimation precision decreased as the number of subjects and items was reduced. The discrimination parameter was estimated more precisely in mirt than in NONMEM, with an RMSE approximately 4-fold lower than that observed in NONMEM for scenarios with 100 or 50 subjects. For the threshold parameters, however, NONMEM appeared to be more precise. Overall model fit (i.e. log-likelihood) was similar between mirt and NONMEM for the scenario with 500 subjects and 5 items. When the number of subjects was reduced to 100 or 50, mirt performed better in approximately 10% and 25% of the cases, respectively. Computationally, mirt was faster than NONMEM: the average estimation run time for a simulated dataset with 500 subjects and 20 items was approximately 19 seconds in mirt compared with approximately 5 minutes in NONMEM.
Conclusions: Item parameter recovery and overall model fit were similar between NONMEM and mirt. The ease of use and speed of estimation in the existing mirt R package make it a well-suited alternative to provide item parameter estimates for a pharmacometric IRT analysis.
References:
[1] Chalmers RP. mirt: A Multidimensional Item Response Theory Package for the R Environment. J Stat Softw. 2012;48(6):1-29. doi: 10.18637/jss.v048.i06.
[2] Beal S, Sheiner L, Boeckmann A, Bauer R (2009) NONMEM user’s guides (2018). Gaithersburg, MD, USA.
Reference: PAGE () Abstr 9481 [www.page-meeting.org/?abstract=9481]
Poster: Methodology - Estimation Methods