IV-048

GENOME-WIDE ASSOCIATION STUDY OF MODEL-DERIVED TIME TO VIRAL CLEARANCE IN COVID-19: COMPARISON OF METHODS IN A SIMULATION STUDY AND APPLICATION TO THE DISCOVERY TRIALS

Aglaé Perrin 1, Jérémie Guedj 1, Guillaume Lingas 1, Clément Massonaud 1,2, Maude Bouscambert-Duchamp 3, Alexandre Gaymard 3,4, Nathan Peiffer-Smadja 1,5,6, Adrian Gervais 7,8, Paul Bastard 7,8,9,10, Astrid Marchal 7,8, Thibault Kerdiles 11,12, Laurent Abel 7,8,11, Anne Puel 7,8,11, Maya Hites 13, Florence Ader 14,15, France Mentré 1,2, Aurélie Cobat 7,8,9, Julie Bertrand 1, the DisCoVeRy Study Group

1 Université Paris Cité, IAME, INSERM (Paris, France), 2 Department of Epidemiology, Biostatistics and Clinical Research, Hospital Bichat, APHP (Paris, France), 3 Laboratoire de Virologie, Institut des Agents Infectieux de Lyon, Centre National de Référence des Virus Respiratoires France Sud, Hospices Civils de Lyon (Lyon, France), 4 Laboratoire VirPath, Centre International de Recherche en Infectiologie (CIRI), Inserm 1111, CNRS UMR5308, École Normale Supérieure de Lyon, UCBL (Lyon, France), 5 AP-HP, Hôpital Bichat, Service de Maladies Infectieuses et Tropicales (Paris, France), 6 National Institute for Health Research, Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London (London, UK), 7 Laboratory of Human Genetics of Infectious Diseases, Necker Branch, Institut National de la Santé et de la Recherche Médicale (INSERM) U1163, Necker Hospital for Sick Children (Paris, France), 8 Imagine Institute, Paris Cité University (Paris, France), 9 St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University (New York, USA), 10 Pediatric Hematology-Immunology and Rheumatology Unit, Necker Hospital for Sick Children, Assistance Publique-Hôpitaux de Paris (AP-HP) (Paris, France), 11 AP-HP, Département des Maladies Infectieuses et Tropicales, Hôpital Saint-Louis, Lariboisière (Paris, France), 12 Faculté de Médecine, Sorbonne Université (Paris, France), 13 Clinic of Infectious Diseases, Hôpital Universitaire de Bruxelles (Bruxelles, Belgium), 14 Département des Maladies infectieuses et tropicales, Hospices Civils de Lyon (Lyon, France), 15 Centre International de Recherche en Infectiologie (CIRI), Inserm 1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, École Normale Supérieure de Lyon, Univ Lyon (Lyon, France)

Objectives:
COVID-19 severity varies widely across individuals, from asymptomatic infection to life-threatening illness [1]. Demographic and clinical factors such as age, sex, and comorbidities explain part of this variability [2], and genetic factors have been identified through genome-wide association studies (GWAS) and sequencing studies, particularly in type I interferon (IFN) pathways.

As viral load dynamics have been associated with variability in severity [3], and viral load profiles remain highly variable even within clinically homogeneous populations such as that of the DisCoVeRy 1 clinical trial (NCT 04315948; EudraCT2020-000936-23), we believe that new genetic determinants of severity could be identified through GWAS of viral load dynamics.

The present work compares different approaches to performing GWAS of viral load dynamics in a simulation study, with an application to the DisCoVeRy 1 clinical trial.

Methods:
In the DisCoVeRy 1 trial, patients in the standard of care (SoC, N=329) and SoC+remdesivir (N=336) arms were sampled regularly after randomization to measure viral load. Among them, N=116 and N=133, respectively, provided samples for genotyping with the Illumina GSA array.

Genetic data underwent standard quality control [4], and imputation was performed on the Helmholtz Munich imputation server with the 1000 Genomes reference panel [5].

As phenotype, we used the individual time to viral clearance predicted by a target-cell–limited model developed by Lingas et al. [6]. For patients in the SoC+remdesivir arm, the treatment effect on the clearance rate of infected cells was set to 0 before calculating the time to viral clearance.

In GWAS, the standard approach relies on univariate regression [7], adjusting for population structure using principal component analysis (PCA) [8] and leave-one-chromosome-out ridge regression [9], with significance assessed using Bonferroni correction [10]. Given our limited sample size, we also explored penalized regression methods [11], adjustment on the leading principal components only, and selection of the top 10 signals.
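As an illustration of the univariate strategy described above, the following Python sketch regresses a phenotype on each SNP in turn after projecting out an intercept and covariates (for example, leading principal components), then applies either a Bonferroni threshold or a top-k selection. It is a minimal sketch under stated assumptions, not the pipeline used in this work: the function names (`univariate_scan`, `bonferroni_hits`, `top_k`) and the 0.05 significance level are hypothetical, p-values rely on a normal approximation to the t statistic, and neither the leave-one-chromosome-out ridge step nor penalized regression is shown.

```python
import math


def residualize(v, basis):
    """Remove from v its projection onto each orthonormal basis vector."""
    out = [float(x) for x in v]
    for b in basis:
        coef = sum(x * y for x, y in zip(out, b))
        out = [x - coef * y for x, y in zip(out, b)]
    return out


def orthonormal_basis(columns):
    """Gram-Schmidt orthonormalization of covariate columns."""
    basis = []
    for col in columns:
        r = residualize(col, basis)
        norm = math.sqrt(sum(x * x for x in r))
        if norm > 1e-12:  # skip columns that are (near) collinear
            basis.append([x / norm for x in r])
    return basis


def univariate_scan(y, snps, covariates):
    """Two-sided p-value for each SNP, adjusted for an intercept and covariates.

    snps and covariates are lists of columns, each of length len(y).
    Assumption: normal approximation to the t statistic.
    """
    n = len(y)
    basis = orthonormal_basis([[1.0] * n] + list(covariates))
    dof = n - len(basis) - 1  # residual degrees of freedom
    y_res = residualize(y, basis)
    syy = sum(v * v for v in y_res)
    pvals = []
    for g in snps:
        g_res = residualize(g, basis)
        sgg = sum(v * v for v in g_res)
        if sgg < 1e-12:  # monomorphic or fully explained by covariates
            pvals.append(1.0)
            continue
        sgy = sum(a * b for a, b in zip(g_res, y_res))
        beta = sgy / sgg
        rss = max(syy - beta * sgy, 1e-300)
        se = math.sqrt(rss / dof / sgg)
        pvals.append(math.erfc(abs(beta / se) / math.sqrt(2.0)))
    return pvals


def bonferroni_hits(pvals, alpha=0.05):
    """Indices of SNPs with p-value below alpha / number of tests."""
    threshold = alpha / len(pvals)
    return [j for j, p in enumerate(pvals) if p < threshold]


def top_k(pvals, k=10):
    """Indices of the k smallest p-values, most significant first."""
    return sorted(range(len(pvals)), key=pvals.__getitem__)[:k]
```

With a small sample, `bonferroni_hits` may return an empty list while `top_k` always returns k candidates, which is the trade-off between error control and sensitivity that the simulation study examines.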

We evaluated 14 combinations of these methods in a simulation study. A hundred datasets were simulated under the null hypothesis of no genetic effect (H0) and under an alternative hypothesis (H1) in which 3 variants accounted for about 30% of the variability in the viral clearance rate.
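To make the simulation design concrete, the sketch below generates one dataset in which a few causal variants jointly explain a chosen fraction of the variance of a standardized quantitative trait. It is an illustrative simplification, not the simulation procedure of this work: the SNPs are drawn independently under Hardy-Weinberg equilibrium (no linkage disequilibrium), the trait stands in for the model parameter affected by the variants, and the allele frequencies, the split of the 30% between variants (`shares`), and the function names are all hypothetical.

```python
import math
import random


def effect_sizes(causal_mafs, shares, h2=0.30):
    """Per-allele effects so that variant j explains shares[j] * h2 of a
    unit-variance trait (genotype variance is 2p(1-p) under HWE)."""
    return [math.sqrt(s * h2 / (2.0 * p * (1.0 - p)))
            for p, s in zip(causal_mafs, shares)]


def simulate_dataset(n, mafs, causal, betas, rng=None):
    """Return (genotypes, trait): genotypes[j][i] is the allele count (0, 1, 2)
    of SNP j in individual i; pass betas of zero to simulate under H0."""
    rng = rng or random.Random()
    genotypes = [[(rng.random() < p) + (rng.random() < p) for _ in range(n)]
                 for p in mafs]
    # Variance explained by the causal variants, assuming independent SNPs.
    h2 = sum(b * b * 2.0 * mafs[j] * (1.0 - mafs[j])
             for j, b in zip(causal, betas))
    sd_noise = math.sqrt(1.0 - h2)
    trait = [sum(b * genotypes[j][i] for j, b in zip(causal, betas))
             + rng.gauss(0.0, sd_noise)
             for i in range(n)]
    return genotypes, trait


# Hypothetical settings: 3 causal SNPs with unequal contributions.
mafs = [0.30, 0.20, 0.10, 0.25, 0.40]
causal = [0, 1, 2]
betas = effect_sizes([mafs[j] for j in causal], shares=[0.5, 0.3, 0.2])
genotypes, trait = simulate_dataset(249, mafs, causal, betas, random.Random(1))
```

Unequal `shares` are one way to obtain causal variants that differ in how easily they are detected; the actual effect sizes used in the study are not given here.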

For Bonferroni-based methods, we measured the family-wise error rate (FWER) under H0, while we assessed power (FWER-corrected), precision, recall, and F1-score under H1. For top-10 signals-based methods, we calculated the proportion of datasets in which the simulated SNPs were identified.
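The evaluation criteria listed above can be sketched as follows. This is a minimal illustration with hypothetical function names, and it assumes that a selected SNP counts as a true positive only when its index exactly matches a simulated causal SNP; the matching rule actually used (for example, whether nearby variants in linkage disequilibrium count) and the FWER correction of power are not specified here.

```python
def fwer(null_selections):
    """Fraction of H0 datasets in which at least one SNP was selected."""
    return sum(1 for s in null_selections if s) / len(null_selections)


def precision_recall_f1(selected, causal):
    """Precision, recall and F1-score of one selection against the causal set."""
    selected, causal = set(selected), set(causal)
    true_pos = len(selected & causal)
    precision = true_pos / len(selected) if selected else 0.0
    recall = true_pos / len(causal) if causal else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1


def detection_rate(selections, snp):
    """Proportion of datasets in which a given causal SNP was selected."""
    return sum(1 for s in selections if snp in s) / len(selections)
```

For instance, `detection_rate` applied to the top-10 lists of all H1 datasets gives, for each causal SNP, the proportion of datasets in which it was identified.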

Subsequently, we analysed the genetic data and time to viral clearance estimates from the DisCoVeRy 1 trial, adjusting for age, sex, and auto-antibodies against type I IFN (AAB-IFN-I) status.

Results:
After quality control and imputation, N=249 patients with 10,594,815 polymorphisms and AAB-IFN-I status were available in the DisCoVeRy 1 trial.

With Bonferroni-based methods, univariate regressions had a FWER above 10%, versus approximately 0% for penalized regressions.

After correction for the FWER, univariate regression using only the first 3 principal components obtained powers of 0.62, 0.19 and 0.05 and F1 scores of 0.44, 0.13, and 0.03, with precision of 0.55, 0.32 and 0.08 and recall of 0.5, 0.22 and 0.047 for the first, second, and third causal SNPs, respectively. Penalized regression using only the first principal component obtained powers of 0.33, 0.13 and 0.03 and F1 scores of 0.32, 0.12, and 0.027, with precision of 0.315, 0.15 and 0.035 and recall of 0.33, 0.13 and 0.03 for the first, second, and third causal SNPs, respectively. Leave-one-chromosome-out ridge regression led to worse performance.

For top-10 signals-based methods, both univariate and penalized regression approaches captured all 3 causal SNPs, achieving powers of 0.94 and 0.94, respectively, for the first SNP, 0.74 and 0.80 for the second SNP, and 0.39 and 0.52 for the third SNP.

In the DisCoVeRy 1 trial, Bonferroni univariate regression identified no SNPs, while top-10 selection found 7 SNPs common to both approaches, with no overlap with previously reported loci.

Conclusions:
While the Bonferroni threshold remains the preferred approach for GWAS, a top-signals selection strategy can provide valid results when sample size or power is limited. In our study, univariate or penalized regression combined with PCA and selection of the top 10 signals proved practical for detecting associations in high-dimensional genetic data using a model-derived phenotype. Replication and meta-analysis are planned in the DisCoVeRy 2 trial. Variants with a confirmed association will be integrated into the population viral load model.

References:
[1] Huang C. et al. Lancet. (2020)
[2] O’Driscoll M. et al. Nature. (2021)
[3] Néant N. et al. Proceedings of the National Academy of Sciences USA. (2021)
[4] Anderson CA. et al. Nature Protocols. (2010)
[5] Rayner NW. et al. Nature Genetics. (2024)
[6] Lingas G. et al. Journal of Antimicrobial Chemotherapy. (2022)
[7] Tam V. et al. Nature Reviews Genetics. (2019)
[8] Price AL. et al. Nature Genetics. (2006)
[9] Mbatchou J. et al. Nature Genetics. (2021)
[10] Bland JM. et al. British Medical Journal. (1995)
[11] Tibshirani R. Journal of the Royal Statistical Society Series B Statistical Methodology. (1996)

Reference: PAGE 34 (2026) Abstr 11884 [www.page-meeting.org/?abstract=11884]

Poster: Drug/Disease Modelling - Other Topics