Beyond disease progression – Item response theory modelling to gain structural insights into disease facets underlying clinical score assessments
Iris K. Minichmayr (1), Elodie L. Plan (1), Benjamin Weber (2), Sebastian Ueckert (1)
(1) Department of Pharmacy, Uppsala University, Uppsala, Sweden; (2) Translational Medicine and Clinical Pharmacology, Boehringer Ingelheim Pharmaceuticals, Inc., Ridgefield, Connecticut, USA
Objectives: Diseases are commonly multifactorial, complicating the choice of appropriate diagnostic and clinical endpoints. Clinical scores, often summarised to composite scores, are widely used to capture the multi-faceted nature of complex diseases. The traditionally questionnaire-based, discrete score data—collected by healthcare professionals or as ‘patient-reported outcomes’—has gained increasing relevance in informing both treatment and drug development decisions.
Item response theory (IRT) models extract more information from these data by considering each score component (‘item’) individually [1]. Their use marks a significant step towards appropriately capturing the true complexity of a disease in a pharmacometric model. Nonetheless, most applications of IRT merely describe the evolution of a score (e.g. [2,3]) and fail to provide the additional insights attainable from the item-level observations. Questions like “What are the different facets of a disease detectable in the data?”, “How do these facets evolve?”, and “Which covariates are associated with each facet?” are of high clinical relevance but are not addressed by the current generation of models.
This study sought to set up a novel modelling workflow to elucidate the (hidden) structure of disease aspects underlying a set of clinical score-based assessments (irrespective of any predefined categorisation, e.g. questionnaire subscales) and ultimately to enhance the understanding of complex diseases. The workflow is exemplified using a histological liver scoring system commonly used to evaluate nonalcoholic fatty liver disease (NAFLD).
Methods: The analysis dataset originated from the public NIDDK NAFLD Adult database [4]. Liver biopsy evaluations were based on a histological scoring system comprising 8 binary and 5 ordered categorical items, including fibrosis and the components of the NAS (NAFLD activity score = steatosis + inflammation + hepatocellular ballooning) [5]. The population spanned the full spectrum of NAFLD (NAS 0-8) and fibrosis (0-4).
IRT models were developed based on one histological liver assessment per patient. Exploratory models, which impose no prior hypothesis about the structure of the item response data, guided the development of confirmatory models, in which the items are assigned to specific latent variables. More precisely, item clustering based on the matrix of angles between item pairs allowed associations between items and latent disease constructs to be visualised in a cluster dendrogram. The model was then used to explore associations between the histological liver scores and 69 diverse noninvasive biomarkers using full random effects modelling (PsN 4.10.0 [6]). Modelling activities were performed in R 3.6.1 (package mirt) [7].
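The angle-based item clustering described above can be illustrated with a minimal sketch. It assumes that each item is represented by its vector of slope (discrimination) parameters from an exploratory multidimensional model and that clusters are merged by average linkage; the abstract specifies neither detail. The slope values and item names are hypothetical and are not taken from the NAFLD analysis, and the code uses only the Python standard library rather than the R workflow used in the study.

```python
import math

def angle_matrix(slopes):
    """Pairwise angles (in degrees) between item slope vectors."""
    n = len(slopes)
    angles = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(slopes[i], slopes[j]))
            norm_i = math.sqrt(sum(a * a for a in slopes[i]))
            norm_j = math.sqrt(sum(b * b for b in slopes[j]))
            # Clamp to [-1, 1] to guard against rounding error before acos.
            cos = max(-1.0, min(1.0, dot / (norm_i * norm_j)))
            angles[i][j] = angles[j][i] = math.degrees(math.acos(cos))
    return angles

def average_linkage(angles, names):
    """Agglomerative clustering on the angle matrix.

    Returns the merge steps as (items_a, items_b, mean_angle); read in
    order, these steps define the clades of a cluster dendrogram.
    """
    clusters = {i: [i] for i in range(len(names))}
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = clusters[ids[x]], clusters[ids[y]]
                d = sum(angles[i][j] for i in a for j in b) / (len(a) * len(b))
                if best is None or d < best[0]:
                    best = (d, ids[x], ids[y])
        d, id_a, id_b = best
        a, b = clusters.pop(id_a), clusters.pop(id_b)
        merges.append(([names[i] for i in a], [names[i] for i in b], d))
        clusters[next_id] = a + b
        next_id += 1
    return merges

# Hypothetical slopes of four items on two latent dimensions (illustrative only).
names = ["item_1", "item_2", "item_3", "item_4"]
slopes = [[1.0, 0.0], [0.9, 0.1], [0.1, 1.2], [0.0, 0.8]]

angles = angle_matrix(slopes)
merges = average_linkage(angles, names)
for items_a, items_b, d in merges:
    print(items_a, "+", items_b, "at mean angle", round(d, 1))
```

Items whose slope vectors point in similar directions (small angles) merge early and form a clade, which suggests that they measure the same latent construct; large angles between clades point to distinct disease aspects.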
Results: An extended IRT model based on 13 histological features, i.e. beyond the widely used NAS and fibrosis scores, appeared most adequate to characterise latent disease aspects underlying NAFLD. In line with the nature of the items, the model comprised 5 graded-response and 8 two-parameter logit models. Exploratory models and subsequent cluster analyses revealed several clades of items in the cluster dendrogram, suggesting that different disease aspects manifest in the score assessments. Distinct disease aspects were found to underlie the four cardinal features of NAFLD, which guided the structure of the final model: steatosis, inflammation, ballooning and fibrosis were assigned to separate latent disease aspects. The residual items were associated with any hidden disease construct, even one independent of the NAS items and fibrosis. The multidimensional IRT model made it possible to compute expected biopsy-based scores conditional on specific biomarker values for all 13 histological features and various covariates, revealing that different noninvasive biomarkers reflected the activity of different histological lesions associated with NAFLD (e.g. platelets – fibrosis).
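To illustrate the two item model types named above and the notion of an expected item score, the following minimal sketch evaluates a two-parameter logistic model for a binary item and a graded-response model for an ordered categorical item. It is unidimensional and uses a discrimination/difficulty parameterisation, whereas the model in this work is multidimensional and its parameterisation is not stated in the abstract; all parameter values are hypothetical.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def two_pl(theta, a, b):
    """P(item = 1 | theta) for a binary item (two-parameter logistic model).

    a: discrimination, b: difficulty (assumed parameterisation).
    """
    return logistic(a * (theta - b))

def graded_response(theta, a, thresholds):
    """P(item = k | theta), k = 0..K, for an ordered categorical item.

    thresholds must be increasing; each category probability is the
    difference between adjacent cumulative probabilities P(item >= k).
    """
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

def expected_score(probs):
    """Expected item score given the category probabilities."""
    return sum(k * p for k, p in enumerate(probs))

# Hypothetical parameters, for illustration only.
theta = 0.0  # latent disease severity of one patient
p_binary = two_pl(theta, a=1.5, b=0.0)
p_ordered = graded_response(theta, a=1.0, thresholds=[-1.0, 0.0, 1.0])

print("P(binary item = 1):", p_binary)
print("category probabilities:", [round(p, 3) for p in p_ordered])
print("expected ordered score:", round(expected_score(p_ordered), 3))
```

In a covariate analysis, the latent variable would additionally depend on biomarker values, so evaluating these functions at the biomarker-specific latent value yields a biomarker-conditional expected score of the kind reported above.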
Conclusions: This novel IRT modelling workflow exploits the complexity of clinical score data to gain quantitative insights into the structure of the different facets underlying a disease, including how many latent constructs are measured by the score assessments and by which items these are represented. The holistic approach leverages the benefits of IRT beyond known fields of application by providing additional insights into complex diseases like NAFLD that are routinely evaluated by score-based healthcare-related measures.
References:
[1] Ueckert S. CPT Pharmacometrics Syst Pharmacol. 2018;7(4):205-218.
[2] Cerou M, Peigné S, Comets E, et al. AAPS J. 2019;22(1):4.
[3] Krekels E, Novakovic AM, Vermeulen AM, et al. CPT Pharmacometrics Syst Pharmacol. 2017;6(8):543-551.
[4] Neuschwander-Tetri BA, Clark JM, Bass NM, et al. Hepatology. 2010;52(3):913-924.
[5] Kleiner DE, Brunt EM, Van Natta M, et al. Hepatology. 2005;41(6):1313-1321.
[6] Karlsson MO, Nordgren R, et al. https://uupharmacometrics.github.io/PsN (accessed 2021-05-10).
[7] Chalmers RP. J Stat Softw. 2012;48(6):1-29.