Statistical methods to detect a disease-modifying treatment effect in clinical trials with few or very few patients for rare neurological diseases
Niels Hendrickx1, France Mentré1, Alzahra Hamdan2, Mats Karlsson2, Andrew Hooker2, Andreas Traschütz3,4, Cynthia Gagnon5, Rebecca Schüle6, ARCA study group, PROSPAX consortium, EVIDENCE-RND consortium, Matthis Synofzik3,4, Emmanuelle Comets1,7
1Université Paris Cité et Université Sorbonne Paris Nord, IAME, Inserm, F-75018, 2Pharmacometrics Research Group, Department of Pharmacy, Uppsala University, 3Division Translational Genomics of Neurodegenerative Diseases, Hertie Institute for Clinical Brain Research (HIH), University of Tübingen, 4German Center for Neurodegenerative Diseases (DZNE), 5Centre de Recherche du CHUS Et du Centre de Santé Et Des Services Sociaux du Saguenay-Lac-St-Jean, Faculté de Médecine, Université de Sherbrooke, 6Hertie-Center for Neurology, University of Tübingen, 7Univ Rennes, Inserm, EHESP, Irset - UMR_S 1085, 35000
Objectives
An increasing number of trials of novel disease-modifying therapies now focus on rare neurological diseases (RNDs), which often serve as forerunners for targeted mechanistic high-effect therapies (e.g., gene, RNA or small-compound therapies). However, this group of diseases, while large overall (>9000 diseases), poses particular statistical challenges for clinical trial design [1]. For example, randomised controlled trials (RCTs) can be used to detect disease-modifying effects, but are limited by the small number of patients available per RND and by patient heterogeneity [2]. RCTs with smaller sample sizes can be considered, but standard statistical tests and modelling approaches rely on asymptotic assumptions that may not hold in small samples. Alternatively, single-arm trials can be considered [3], with sample sizes down to one patient (n-of-1) [4]. Real-world evidence can be used to inform the design and settings of trials [5], but there is a methodological need to investigate how it can inform trial designs that test a disease-modifying treatment effect in rare diseases. Our work therefore focuses on two main objectives:
1. Model the natural history of the disease by developing a non-linear mixed effect model (NLMEM) describing disease progression in patients with an exemplary RND using patient registries, accounting for patient heterogeneity and missing covariates.
2. Develop a simulation framework to compare trial designs using the developed model and suitable inclusion criteria. We first compare parallel, crossover and delayed-start RCTs. We then investigate designs for smaller samples, focusing on single-arm and n-of-1 trials.
As a showcase, these developments are demonstrated on the Autosomal Recessive Cerebellar Ataxias (ARCAs), a group of ultra-rare, progressive neurodegenerative disorders. They mainly affect the cerebellum but also damage other neurological systems, impairing gait, balance, speech and fine motor movements [6].
Symptoms usually emerge during childhood or early adulthood, and more than 100 genotypes have been identified. ARCAs are prime candidates for targeted molecular therapies or gene therapies. While there is currently no approved disease-modifying treatment for most ARCAs, multiple ongoing clinical trials aim to quantify the treatment effect of such therapies.
Methods
1. Natural history of ARSACS
We modelled the change in the Scale for the Assessment and Rating of Ataxia (SARA) score versus time since onset of symptoms (TSO) using an NLMEM [7]. We used the population of 173 patients with an exemplary ARCA, namely “Autosomal Recessive Spastic Ataxia Charlevoix Saguenay” (ARSACS), included in the prospective real-world ARCA registry [8]. The Multivariate Imputation by Chained Equations (MICE) algorithm [9] was used to impute missing covariates, and a covariate selection procedure based on a pooled p-value accounted for the multiply imputed data sets [10].
2. Simulation studies
For RCTs, the previously developed model was modified to include a disease-modifying treatment effect that slows disease progression. Two-arm RCTs were then simulated (500 replicates, 2 observations per year), assuming patients entered the trial within 30 years of symptom onset (TSO=0-30 years). Each scenario was simulated without a drug effect (DE=0) and with DE=50%. We compared the power of parallel, crossover and delayed-start designs across several trial settings: trial duration (2 or 5 years); disease progression rate (a=0.11 or 0.22 per year); magnitude of residual error (s=2 or s=0.5); number of patients (100 or 40); and method of statistical analysis (longitudinal analysis with non-linear models or with rich or sparse linear models).
For n-of-1 trials, we used the previously developed model, modified to accommodate a DE. Parameters were estimated on the PROSPAX dataset [11], comprising 86 patients diagnosed with ARSACS.
Patients were simulated with a TSO at inclusion chosen to represent different levels of disease progression [12]: 20-30 years for a slow progression rate and 10-20 years for a fast progression rate. Each patient was simulated for 2.5 years without a DE and 2.5 years with a DE, with 4 observations per year (1000 replicates). Each patient was analysed using the previously developed model with an additional DE during the treated period, with a weak prior (mean=0, SD=5). The significance of the treatment effect was tested for each patient by evaluating the conditional mean and variance of the individual model parameters and using them in a Wald test on the DE. Scenarios were compared according to their type 1 error and corrected power. We investigated the influence of the disease progression rate (slow or fast), the DE (0, 50 and 100%), and the magnitude of residual variability (s=1.84 or s=0.5).
Results
1. Natural history of ARSACS
A four-parameter logistic model of SARA score versus TSO best described the data [7]. Men were estimated to have a lower SARA score at disease onset and a moderately higher maximum SARA score, and time to progression was estimated to be shorter in patients with an age of onset over 15 years. The population disease progression rate started at 0.1 points per year and peaked at 0.8 points per year (between 25 and 35 years after onset of symptoms).
2. Simulation studies
For RCTs, analysis with an NLMEM resulted in controlled type 1 error and higher power than analysis with a rich or sparse linear mixed effect model, with powers of 88%, 75% and 49%, respectively, under a parallel design. Parallel and delayed-start designs performed better than crossover designs. With slow disease progression and high residual error, longer trials are needed for power to exceed 80%: 5 years for slower-progressing and 2 years for faster-progressing ataxias.
For n-of-1 trials, all tested scenarios resulted in controlled type 1 error.
With a slow progression rate, most trials had low power (<30%); with DE=100% and s=0.5, the corrected power rose to 70%. With a faster progression rate and s=1.84, trials had a corrected power of 12% (DE=50%) and 42% (DE=100%). With the same faster progression rate but a smaller residual error (s=0.5), the power increased to 60% (DE=50%) and 94% (DE=100%).
Conclusion
We first developed an NLMEM, based on real-world evidence, that describes the natural history of ARSACS over time. We then used this model to compare different RCT settings and designs. Our simulations showed that delayed-start designs are promising: they are as powerful as parallel designs, with the advantage that all patients are treated. Finally, we developed a method for n-of-1 trials that uses the model built on natural history data as a reference to quantify a disease-modifying effect, and found that a powerful test is possible given appropriate settings. This simulation framework relies on the model being well specified; extensions could consider more robust approaches, such as model averaging.
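As an illustration of the ingredients described above, the following minimal Python sketch shows a four-parameter logistic curve for SARA versus TSO, one possible way a disease-modifying effect could slow progression, and a two-sided Wald test on an estimated effect. It is a sketch under stated assumptions, not the model or code used in this work: the parameter names (`s0`, `smax`, `t50`, `gamma`), their values, and the choice to let disease time advance at a rate of (1 - DE) during treatment are all hypothetical, and no random effects, covariates or residual error are included.

```python
import math

# Hypothetical parameter values chosen for illustration only;
# they are NOT the estimates obtained from the registry data.
PARAMS = {"s0": 2.0, "smax": 36.0, "t50": 30.0, "gamma": 0.1}


def sara_4pl(tso, s0, smax, t50, gamma):
    """Four-parameter logistic SARA score versus time since onset (years).

    s0: lower asymptote, smax: upper asymptote,
    t50: time of half-maximal progression, gamma: steepness (per year).
    """
    return s0 + (smax - s0) / (1.0 + math.exp(-gamma * (tso - t50)))


def sara_with_de(tso_at_inclusion, t_in_trial, de, **params):
    """SARA score after t_in_trial years on treatment.

    Assumption: a disease-modifying effect `de` (0 to 1) makes disease
    time advance at rate (1 - de), so de=0 follows natural history and
    de=1 freezes the score at its value at inclusion.
    """
    effective_tso = tso_at_inclusion + (1.0 - de) * t_in_trial
    return sara_4pl(effective_tso, **params)


def wald_test(estimate, std_error):
    """Two-sided Wald test of H0: effect = 0, using a normal reference."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Expected SARA score after 2.5 treated years for a patient included at TSO = 25.
for de in (0.0, 0.5, 1.0):
    score = sara_with_de(25.0, 2.5, de, **PARAMS)
    print(f"DE={de:.0%}: SARA after 2.5 years = {score:.2f}")
```

In a simulation study, a Wald test of this kind would be applied to each simulated replicate, with the proportion of significant replicates giving the type 1 error (when DE=0) or the power (when DE>0).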