A case study of applying machine learning to improve non-invasive assessment of fibrosis stage in MASLD

Heather Collis (1), Stergios Kechagias (2), Mathias Liljeblad (3), Patrik Nasr (2), Sara Hansson (3), Mattias Ekstedt (2), Jana de Wiljes (4), Jane Knöchel (1)

1. Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg, Sweden; 2. Department of Health, Medicine, and Caring Sciences, Linköping University, Linköping, Sweden; 3. Translational Science and Experimental Medicine, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden; 4. Institute of Mathematics, Technical University of Ilmenau, Ilmenau, Germany

Introduction:

Metabolic dysfunction-associated steatohepatitis (MASH), the more severe form of metabolic dysfunction-associated steatotic liver disease (MASLD), is a major cause of liver-related morbidity and mortality worldwide for which no licensed therapy is currently available [1]. MASH is formally diagnosed by liver biopsy [2]; disease progression is highly heterogeneous and remains poorly understood. Currently, hepatic fibrosis stage is the best predictor of liver-related outcomes and overall mortality [3-5]. As liver biopsy is an invasive procedure, one goal in the field is to validate non-invasive biomarkers of fibrosis that can replace the need for a biopsy. The current clinical standard for non-invasive assessment of fibrosis is the FIB-4 score [6], which combines four variables routinely available in the clinic (age, AST, ALT, and platelet count). A major drawback of the FIB-4 scoring system is that it uses two thresholds to assign fibrosis stage, so patients whose scores fall between the thresholds, approximately 30%, remain unclassified. Thus, there is a need for a better diagnostic model of fibrosis stage to guide future clinical trials or the identification of patients at risk.
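To make the two-threshold problem concrete, the following is a minimal sketch of how a FIB-4 score is computed and mapped to a category. The formula is the standard one, (age × AST) / (platelets × √ALT). The cut-offs of 1.30 and 2.67 are commonly quoted values and are an assumption here, since the abstract does not state which thresholds were used; the function names and the example patient are hypothetical.

```python
import math

# Assumed cut-offs (not stated in the abstract): 1.30 and 2.67 are values
# commonly quoted for ruling advanced fibrosis out and in, respectively.
LOW_CUTOFF = 1.30
HIGH_CUTOFF = 2.67


def fib4(age_years: float, ast_u_per_l: float, alt_u_per_l: float,
         platelets_1e9_per_l: float) -> float:
    """FIB-4 = (age * AST) / (platelets * sqrt(ALT))."""
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))


def classify_fib4(score: float, low: float = LOW_CUTOFF,
                  high: float = HIGH_CUTOFF) -> str:
    """Map a FIB-4 score to one of three categories using two thresholds."""
    if score < low:
        return "mild"
    if score > high:
        return "advanced"
    # Scores between the two thresholds receive no fibrosis call.
    return "indeterminate"


# Hypothetical patient values, for illustration only.
score = fib4(age_years=60, ast_u_per_l=40, alt_u_per_l=36, platelets_1e9_per_l=200)
print(round(score, 2), classify_fib4(score))  # prints: 2.0 indeterminate
```

A score of 2.0 lies between the two assumed cut-offs, so this hypothetical patient would be left without a classification.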

Objectives:

Develop a random forest pipeline for the classification of mild vs advanced fibrosis in a MASLD cohort.
Compare the results of random forest classification to the FIB-4 score.

Methods:

We studied a MASLD clinical cohort (87 patients) with data for more than 1000 biomarkers, including 972 from Olink plasma proteomics [7]. We developed a random forest pipeline that aims to reduce the number of biomarkers of interest in the dataset. First, we applied the Boruta feature selection algorithm [8], which aims to select “all-relevant” features. Second, we fitted a range of random forest models to different subsets of biomarkers (full dataset, Boruta selection, and data excluding proteomic biomarkers). In nested cross-validation, the model built on the Boruta selection consistently outperformed the other random forest models. Applying Boruta feature selection to the full dataset yielded a reduced set of approximately 20–40 biomarkers. The full random forest and the top-ranked biomarker by feature selection were each evaluated as classifiers of fibrosis stage, and their performance was compared with classification using the FIB-4 score.
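The idea behind “all-relevant” selection can be illustrated with a simplified sketch of the shadow-feature comparison that Boruta is built on: each feature is compared against shuffled copies of the features, which carry no information about the label by construction. This is not the authors' pipeline and not the Boruta package itself. As stated assumptions, the sketch swaps random forest importance for the absolute correlation with the label, replaces Boruta's statistical test with a fixed hit fraction, and uses synthetic data; all names are hypothetical.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def importance(values, labels):
    """Stand-in importance: |correlation| with the class label.
    Boruta itself uses random forest importance here."""
    return abs(pearson(values, labels))


def shadow_select(features, labels, n_rounds=50, min_hit_fraction=0.9, seed=0):
    """Keep features that beat the best shuffled ('shadow') copy in most rounds."""
    rng = random.Random(seed)
    real = {name: importance(values, labels) for name, values in features.items()}
    hits = {name: 0 for name in features}
    for _ in range(n_rounds):
        # Shadows: each feature column shuffled, which breaks any link to the label.
        shadow_best = 0.0
        for values in features.values():
            shuffled = values[:]
            rng.shuffle(shuffled)
            shadow_best = max(shadow_best, importance(shuffled, labels))
        # A feature scores a hit when it outperforms the best shadow this round.
        for name in features:
            if real[name] > shadow_best:
                hits[name] += 1
    return [name for name, h in hits.items() if h / n_rounds >= min_hit_fraction]


# Synthetic data, for illustration only: one informative marker, three noise markers.
rng = random.Random(1)
labels = [i % 2 for i in range(80)]
features = {
    "informative": [y + rng.gauss(0, 0.2) for y in labels],
    "noise_1": [rng.gauss(0, 1) for _ in labels],
    "noise_2": [rng.gauss(0, 1) for _ in labels],
    "noise_3": [rng.gauss(0, 1) for _ in labels],
}
print(shadow_select(features, labels))
```

Because the comparison is against the best shadow rather than against other real features, several correlated markers can all be retained, which is what distinguishes all-relevant selection from selecting a minimal predictive subset.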

Results:

The random forest pipeline non-invasively classifies all patients in the clinical cohort, whereas the FIB-4 score calculated for this cohort leaves 43% of patients unclassified. The performance of the random forest model is comparable to that of the FIB-4 score when indeterminate patients are excluded (AUROC 0.90, Accuracy 0.85 vs AUROC 0.80, Accuracy 0.88). The accuracy of FIB-4 drops markedly when the indeterminate patients are included (Accuracy 0.51). With a single threshold, so that every patient receives a classification, EGFL7 (the top-ranked biomarker by feature selection) achieved accuracy comparable to the full random forest and the FIB-4 score (AUROC 0.77, Accuracy 0.83). Applying the FIB-4 score to the full cohort and then the EGFL7 threshold to the patients left unclassified further improved classification accuracy (Accuracy 0.86).
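The sequential rule described above can be sketched as follows: FIB-4 is applied first, and a single-threshold marker is used only for patients that FIB-4 leaves indeterminate. This is a hedged illustration, not the authors' implementation. The FIB-4 cut-offs, the marker cut-off, the direction of the marker (higher values taken to indicate advanced fibrosis), the convention that an indeterminate call counts as incorrect, and the patient values are all assumptions made for the example.

```python
def fib4_call(score, low=1.30, high=2.67):
    """Two-threshold FIB-4 call; None means indeterminate.
    The cut-offs are assumed values, not taken from the abstract."""
    if score < low:
        return 0  # mild fibrosis
    if score > high:
        return 1  # advanced fibrosis
    return None


def sequential_call(fib4_score, marker_value, marker_cutoff):
    """Use FIB-4 where it gives a call, otherwise fall back to one marker threshold."""
    call = fib4_call(fib4_score)
    if call is not None:
        return call
    # Assumption: higher marker values indicate advanced fibrosis.
    return 1 if marker_value >= marker_cutoff else 0


def accuracy(calls, truth):
    """Fraction of correct calls; an indeterminate call (None) counts as wrong."""
    return sum(c == t for c, t in zip(calls, truth)) / len(truth)


# Hypothetical patients: (FIB-4 score, marker value, biopsy label 0=mild / 1=advanced).
patients = [
    (0.9, 2.1, 0),
    (1.1, 2.4, 0),
    (1.8, 2.0, 0),
    (2.0, 3.6, 1),
    (2.3, 3.9, 1),
    (3.1, 4.2, 1),
    (2.9, 2.2, 0),
    (1.0, 3.8, 1),
]
truth = [label for _, _, label in patients]
MARKER_CUTOFF = 3.0  # hypothetical cut-off, in arbitrary units

fib4_only = [fib4_call(f) for f, _, _ in patients]
combined = [sequential_call(f, m, MARKER_CUTOFF) for f, m, _ in patients]

print("FIB-4 alone:", accuracy(fib4_only, truth))        # prints 0.375
print("FIB-4 then marker:", accuracy(combined, truth))   # prints 0.75
```

In this toy example the gain comes entirely from the three indeterminate patients; the two patients that FIB-4 misclassifies outright are never passed to the marker, so the sequential rule cannot correct them.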

Conclusions:

These results indicate that random forest models can improve non-invasive fibrosis scoring systems. Such models can be used to inform clinical trial inclusion criteria and patient risk stratification. We have shown that the top-ranked biomarker by feature selection can be used, unmodified, as a classification tool with a single threshold value that performs comparably to the full random forest and the FIB-4 score. Further, sequential use of the FIB-4 score and an EGFL7 threshold increases classification accuracy, highlighting that EGFL7 can be a beneficial addition to a patient risk stratification pipeline or to clinical trial inclusion/exclusion criteria.

References:
[1] Younossi, Z. M. The Epidemiology of Nonalcoholic Steatohepatitis. Clin. Liver Dis. 11, 92–94 (2018).
[2] Kleiner, D. E. et al. Design and Validation of a Histological Scoring System for Nonalcoholic Fatty Liver Disease. Hepatology 41, 1313–1321 (2005).
[3] Ekstedt, M. et al. Fibrosis stage is the strongest predictor for disease-specific mortality in NAFLD after up to 33 years of follow-up. Hepatology 61, 1547–1554 (2015).
[4] Taylor, R. S. et al. Association Between Fibrosis Stage and Outcomes of Patients with Nonalcoholic Fatty Liver Disease: A Systematic Review and Meta-Analysis. Gastroenterology 158, 1611–1625 (2020).
[5] Angulo, P. et al. Liver Fibrosis, but no Other Histological Features, Associates with Long-term Outcomes of Patients with Nonalcoholic Fatty Liver Disease. Gastroenterology 149, 389–397 (2015).
[6] Sterling, R. K. et al. Development of a simple noninvasive index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology 43, 1317–1325 (2006).
[7] Assarsson, E. et al. Homogenous 96-Plex PEA Immunoassay Exhibiting High Sensitivity, Specificity, and Excellent Scalability. PLoS ONE 9, e95192 (2014).
[8] Kursa, M. B. & Rudnicki, W. R. Feature selection with the Boruta package. J. Stat. Softw. 36, 1–13 (2010).

Reference: PAGE 32 (2024) Abstr 10767 [www.page-meeting.org/?abstract=10767]

Poster: Real-world data (RWD) in pharmacometrics