Dan Lu (1), Pascal Chanu (2), Gengbo Liu (1), Victor Poon (1), Joy C. Hsu (1), Jin Y. Jin (1), James Lu (1)
(1) Department of Clinical Pharmacology, Genentech/Roche, South San Francisco, CA, USA; (2) Department of Clinical Pharmacology, Genentech/Roche, France
Objectives: Machine learning (ML) and deep learning (DL) approaches to disease progression modeling are evolving, and the available methods each have strengths and limitations worth considering. Here we report two ML/DL regression methods for describing the progression of geographic atrophy (GA), an eye disease with unmet medical need.
Methods: GA area data from two phase III studies in which no treatment effects were identified [1] were used, with study 1 for model training and study 2 for testing. The data represent GA disease progression over ~3.5 years of observation. eXtreme Gradient Boosting (XGB) [2] and a self-normalizing neural network (SNN) [3] with an ensemble approach were applied. First, an XGB model was built using only baseline features (GA area and other features) to identify the important ones, based on all data in the training dataset. Second, both XGB and SNN models were built using the baseline features plus early GA measurements: the first post-baseline observation and the rate of change estimated from that observation relative to baseline. Predictive performance for the change in GA area from the first observation (ΔGA) at a nominal time of 672 days (~18 months after the first post-baseline observation) was assessed in the test dataset using performance measures such as R-squared (R²) and root mean squared error (RMSE).
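The early-measurement feature and the two performance measures described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the per-year time unit, and the choice to bootstrap by resampling test-set observations are all assumptions, since the abstract does not specify them.

```python
import math
import random
from statistics import mean, stdev


def early_growth_rate(baseline_area, first_area, days_since_baseline):
    """Rate of change in GA area (area units per year) between baseline and
    the first post-baseline observation. The per-year unit is an assumption."""
    return (first_area - baseline_area) / (days_since_baseline / 365.25)


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def bootstrap_summary(observed, predicted, metric, n_boot=100, seed=0):
    """Mean and SD of a metric over bootstrap resamples (with replacement).
    Resampling the test observations is an assumed scheme."""
    rng = random.Random(seed)
    n = len(observed)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(metric([observed[i] for i in idx],
                             [predicted[i] for i in idx]))
    return mean(values), stdev(values)
```

Calling `bootstrap_summary(observed, predicted, rmse)` on observed and predicted ΔGA values would return a (mean, SD) pair in the same form as the summaries reported in the Results.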
Results: First, feature importance analysis of the XGB model using the SHAP method [4] showed that the most important baseline feature was baseline GA area, followed by lesion location (subfoveal/non-subfoveal) and contiguity (multifocal/non-multifocal), consistent with the covariates identified in the nonlinear mixed-effects model [5]. For the XGB method, R² and RMSE (mean ± SD, 100 bootstraps) for ΔGA at ~18 months (453 observations) after the first observation (median time 5.5 months) were 0.32 ± 0.06 and 1.50 ± 0.27, respectively. For the SNN method, R² and RMSE were 0.35 ± 0.05 and 1.46 ± 0.25. Goodness-of-fit plots indicated good model performance. Simulations of GA area versus time showed that XGB predictions were flat (horizontal lines) in time windows outside the observed data, whereas the SNN model simulated a continued increase after each patient's last observation.
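The flat behaviour of a tree-based model outside the observed time window can be reproduced with a toy example. The sketch below is not XGBoost and does not use the study data: it is a hand-written gradient boosting of one-split regression stumps on a single hypothetical patient, with all names, data values, and settings chosen for illustration only. Because every split threshold lies between observed time points, any time beyond the last observation falls into the same leaf of every stump and receives the same prediction.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on one feature (least squares)."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2.0  # candidate threshold between observed times
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return thr, lmean, rmean


def fit_boosted_stumps(x, y, n_rounds=50, learning_rate=0.3):
    """Gradient boosting for squared error: each stump fits current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        thr, lmean, rmean = fit_stump(x, residuals)
        stumps.append((thr, lmean, rmean))
        pred = [p + learning_rate * (lmean if xi <= thr else rmean)
                for xi, p in zip(x, pred)]
    return base, stumps, learning_rate


def predict(model, xi):
    base, stumps, lr = model
    return base + lr * sum(l if xi <= thr else r for thr, l, r in stumps)


# Hypothetical patient: time in years, GA area in arbitrary area units.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
areas = [5.0, 5.9, 6.8, 7.9, 8.8, 9.9, 10.8]
model = fit_boosted_stumps(times, areas)

at_last_visit = predict(model, 3.0)
beyond = [predict(model, t) for t in (3.5, 4.0, 5.0)]  # all equal at_last_visit
```

A model that takes time as a continuous input to a smooth function, such as a neural network, is not constrained in this way, which is consistent with the continued increase reported for the SNN simulations.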
Conclusions: The XGB model efficiently identified influential covariates and described the disease trajectory. The SNN model further improved prediction accuracy over XGB and has advantages in extrapolating outside the observed time window. Taken together, these ML and DL approaches are promising for describing and predicting GA disease progression trajectories, for identifying potential factors that distinguish faster from slower progressors, and for providing predictions against which data from new drug candidates can be compared to assess treatment effects. Similar methodology could be applied to other degenerative disease areas.
References:
[1] Holz F., et al. JAMA Ophthalmol. 2018;136(6):666-677
[2] Chen T., et al. arXiv:1603.02754. 2016. https://arxiv.org/abs/1603.02754
[3] Klambauer G., et al. arXiv:1706.02515. 2017. https://arxiv.org/abs/1706.02515
[4] Lundberg S., et al. Nat Mach Intell 2020
[5] Chanu P., et al. A disease progression model for geographic atrophy. PAGE 2019
Reference: PAGE 29 (2021) Abstr 9624 [www.page-meeting.org/?abstract=9624]
Poster: Methodology - New Modelling Approaches