Joint modeling of biomarkers dynamics and survival with competing risks to predict the prognostic of patients hospitalized with severe infectious diseases
Alexandra Lavalley-Morelle (1), France Mentré (1,2), Emmanuelle Comets (1,3), Jimmy Mullaert (1,2)
(1) Université Paris Cité, IAME, Inserm, F-75018 Paris, France, (2) Department of Epidemiology, Biostatistics and Clinical Research, AP-HP, Bichat-Claude Bernard University Hospital, F-75018 Paris, France, (3) INSERM, CIC1414, Université Rennes-1, F-35000, Rennes, France
Objectives: Nowadays, most hospitals are equipped with laboratory information systems that routinely gather all results of biological analyses prescribed by the physicians. Consecutive biological observations represent a huge amount of information that can be used in a joint model to provide individual dynamic predictions of patient prognosis to guide therapeutic management and to forecast hospital needs.
Joint models proposed in the literature usually involve (i) a linear mixed-effects model describing the longitudinal evolution of a single biomarker and (ii) a competing risks model considering discharge from hospital as a competing event [1]. However, biomarker dynamics often follow nonlinear trends, not well captured by linear models.
Moreover, shifting the modeling from one single biomarker to potentially hundreds could certainly improve prediction accuracy, but also comes with statistical issues. Some authors described computational and identifiability issues due to the high number of random effects when multiple correlated longitudinal models are jointly estimated with a survival model [2]. This probably explains why published models are mostly limited to longitudinal models with at most two biomarkers [3] and highlights the need of methods for variable selection in this context.
To address these questions, we relied on two datasets: (1) French data warehouse (OUTCOMEREA) including 6046 patients admitted in Intensive Care Unit (ICU) for sepsis between 1997 and 2015, with a daily follow-up of the Sequential Organ Failure Assessment (SOFA) score until death or discharge, and (2) French database (RisCoV) including 327 patients hospitalized for severe COVID-19 infection followed until death or discharge, with up to 59 biomarkers variably followed over time.
1- Applied objectives
- Develop a full parametric nonlinear joint model involving the SOFA score evolution to predict the death of patients admitted in ICU for sepsis in the OUTCOMEREA database
- Develop a multivariable joint model and a strategy to select a subset of biomarkers to predict the death of patients hospitalized for COVID-19 infection in the RisCoV database
2- Methodological objectives
- Evaluate the SAEM algorithm implemented in Monolix software for nonlinear joint models under competing risks with a rich design (following OUTCOMEREA)
- Evaluate this algorithm for multivariable joint models under competing risks with sparser design (following RisCoV) and assess the validity of the proposed selection strategy
3- Dissemination objectives
- Extend the R saemix package [4] for joint models with possibly multiple longitudinal biomarkers and competing risks models
Methods:
1- Real case studies
- We combine a parametric nonlinear mixed-effects model (NLMEM) to model the SOFA evolution and parametric subdistribution hazard models adjusted on age, to model the risk of death considering the discharge as a competing event. The nonlinear structural function allows monotone as well as non-monotone SOFA trends [5].
- Each biomarker is modeled by parametric linear or nonlinear mixed-effects model jointly estimated with a competing risks model. The first biomarker selection is based on two criteria: the quality of the fit (small residual error, population parameters adequately estimated) and the significance of the link between the longitudinal and the survival parts. Selected biomarkers are included in a multivariable joint model and a backward selection process is applied: at each step, the biomarker having the highest p-value Wald test for the coefficient linking the longitudinal model and the risk of death is removed from the analysis (stop when all p-values are < 5%).
Parameters are estimated by maximization of the likelihood using the SAEM algorithm implemented in Monolix software. Dynamic predictions given various landmark times are derived, and assessed using time-dependent AUC [6].
2- Simulation studies
- We simulate a single biomarker, daily measured and linked to the risk of death, using OUTCOMEREA design. To evaluate the estimation approach, we report the Relative Bias (RB) and Relative Root Mean Squared Errors (RRMSE) of joint model parameter estimates.
- We simulate 7 biomarkers: 3 linked to the risk of death, 2 not linked but correlated with the previous ones, and 2 not linked and independent, using RisCoV design. We evaluate the estimation approach under the true model. To assess the performances of the proposed backward selection algorithm, we report the proportion of scenarios finding the good combination.
3- Dissemination
- We extend the R saemix package initially designed to fit NLMEM to the case of joint models with continuous and/or time-to-event outcomes (including competing risks). We evaluate its estimation performances on previous simulated datasets and compare it with those Monolix
Results:
1- Real case studies
- A raise of the SOFA score is significantly associated with an increase of the risk of death and a decrease of the risk of discharge [5]. The joint modeling approach leads to good predictive performances at day 30 for all landmark times: for landmark = D1, AUC [95% CI] = 0.72 [0.69,0.75] and for landmark = D15, AUC [95% CI] = 0.90 [0.88,0.92].
- Over the 59 biomarkers, 3 are included in the final multivariable joint model: neutrophils, arterial pH and C-reactive protein. The predictive performances are good when sufficient information is available: landmark = D6, AUC [95% CI] = 0.84 [0.75,0.93].
2- Simulation studies
- Under rich design, RB and RRMSE are low for all parameters except for covariate effects having a higher variability. The link coefficient which is of particular interest is precisely and accurately estimated (RB = 2% and RRMSE = 9%).
- Under sparser design with multiple biomarkers, RB and RRMSE are relatively low for all parameters; for the three link coefficients, RB ≈ 15% and RRMSE ≈ 30%. The backward strategy leads to good performances: 74% of the replicates found the set of biomarkers truly associated with the risk of death, and 20% only 2 biomarkers truly associated.
3- Dissemination
- Similar estimation performances are obtained with the SAEM algorithm extended for multi-response in R saemix package.
Conclusions: We showed the benefits of joint modeling through two real case studies of severe infections: the follow-up of the SOFA score is predictive of survival of patients admitted in ICU for sepsis, and the dynamics of neutrophils, arterial pH and CRP are predictive of survival of patients hospitalized for COVID-19 infection. We also showed that the SAEM algorithm implemented in Monolix software provides unbiased estimates of such model parameters under rich or sparser design. We finally extended the R saemix package to the case of joint models and are currently working on integrating a LASSO penalization, as backward selection methods are known to be outperformed by penalized regression methods [7].
References:
[1] Musoro JZ, Zwinderman AH, Abu-Hanna A, Bosman R, Geskus RB. Dynamic prediction of mortality among patients in intensive care using the sequential organ failure assessment (SOFA) score: a joint competing risk survival and longitudinal modeling approach. Statistica Neerlandica, 2018, 72(1):34-47
[2] Shen F, Li L. Backward joint model and dynamic prediction of survival with multivariate longitudinal data. Statistics in Medicine, 2021, 40(20):4395–4409
[3] Andrinopoulou E-R, Rizopoulos D, Takkenberg J, Lesaffre E. Combined dynamic predictions using joint models of two longitudinal outcomes and competing risk data. Statistical Methods in Medical Research, 2017, 26(4):1787–1801
[4] Comets E, Lavenu A, Lavielle M. Parameter estimation in nonlinear mixed effect models using saemix, an R implementation of the SAEM algorithm, Journal of Statistical Software, 2017, 80(3):1-41
[5] Lavalley-Morelle A, Timsit JF, Mentré F, Mullaert J. The OUTCOMEREA network. Joint modeling under competing risks: Application to survival prediction in patients admitted in Intensive Care Unit for sepsis with daily Sequential Organ Failure Assessment score assessments. CPT: Pharmacometrics & Systems Pharmacology, 2022, 11(11):1472-1484
[6] Blanche P, Proust-Lima C, Loubère L, Berr C, Dartigues JF, Jacqmin-Gadda H. Quantifying and comparing dynamic predictive accuracy of joint models for longitudinal marker and time-to-event in presence of censoring and competing risks. Biometrics, 2015, 71(1):102-13
[7] Tibshirani R. Regression Shrinkage and Selection Via the Lasso, Journal of the Royal Statistical Society, 1996, 58(1):267-288