New tests for trials evaluating disease-modifying treatment in very few patients using longitudinal data in Autosomal Recessive Cerebellar Ataxias - PAGE Meeting (Population Approach Group Europe)

Niels Hendrickx ¹, France Mentré ¹, Mats Karlsson ², Andrew Hooker ², Andreas Traschütz ³,⁴, Rebecca Schüle ⁵,⁶, PROSPAX consortium, EVIDENCE-RND Consortium, Matthis Synofzik ³,⁴, Emmanuelle Comets ¹,⁷

1 Université Paris Cité et Université Sorbonne Paris Nord, Inserm, IAME (Paris, France), 2 Department of Pharmacy, Pharmacometrics Research Group, Uppsala University, (Uppsala, Sweden), 3 Division Translational Genomics of Neurodegenerative Diseases, Hertie Institute for Clinical Brain Research (HIH), University of Tübingen (Tübingen, Germany), 4 German Center for Neurodegenerative Diseases, (DZNE) (Tübingen, Germany), 5 Department of Neurology, Division of Neurodegenerative Diseases and Movement Disorders, Heidelberg University Hospital and Faculty of Medicine (Heidelberg, Germany), 6 Center for Neurology and Hertie Institute for Clinical Brain Research, University of Tübingen (Tübingen, Germany), 7 Univ Rennes, Inserm, EHESP, Irset - UMR S 1085 (Rennes, France)

Objectives:

Autosomal Recessive Cerebellar Ataxias (ARCAs) are ultra-rare, progressive neurodegenerative disorders that primarily affect the cerebellum but also cause multi-systemic involvement. Clinical manifestations include gait and balance disturbances, dysarthria, and impaired fine motor coordination (1). Although no disease-modifying therapies are currently approved for most ARCAs, they are strong candidates for targeted molecular interventions (2). However, conventional randomized controlled trial (RCT) designs are not feasible due to small sample size and heterogeneity. Instead, evaluation often relies on trials of very few patients (3).
These challenges are further compounded by the specific difficulties of conducting trials in ARCAs (4). Due to the absence of a universal biomarker, disease severity must be measured using clinical composite scales such as the Scale for the Assessment and Rating of Ataxia (SARA) (5). In addition, conventional N-of-1 crossover designs are unsuitable in this context, as the therapies under investigation are intended to be long-lasting rather than purely symptomatic.
To address these limitations, the FDA has recommended leveraging natural history cohorts and patient registries to guide trial design, refine inclusion criteria, and define primary outcomes (6,7). In this work, we propose two tests for detecting disease-modifying treatment effects in trials of very few patients, integrating natural history data (NHD).
Methods:
We propose two new tests that evaluate an individual Drug Effect (DE) based on NHD and Non-Linear Mixed Effect Models (NLMEM), in two-period, delayed onset trials. Patients are untreated during the first period and start treatment at the beginning of the second period. The first test uses a Bayesian estimation that relies on priors calibrated with an NLMEM estimated on NHD. The DE is then tested using the conditional distribution of the DE (CDDE). The second test is an observation-based approach using machine learning, relying on outlier detection using Pareto Depth Analysis (PDA) (8). This method extracts features from the NHD that are expected to discriminate between treated and untreated patients. Here, we performed a piece-wise linear regression to estimate the slopes of SARA score during each trial period. Pareto Fronts, representing the typical distances between the features, are built in the NHD, and a depth is defined relative to these fronts in the training set. A test is then performed comparing the depth of one or several individuals to the distribution of depths in the training set.
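The feature-extraction step for the observation-based test can be illustrated with a minimal sketch. It assumes that the piece-wise linear regression amounts to two independent ordinary least-squares fits, one per trial period, with no continuity constraint at the treatment switch; the abstract does not specify this. The function names `ols_slope` and `period_slopes` are hypothetical, and the Pareto depth computation itself is not shown.

```python
def ols_slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    return sxy / sxx


def period_slopes(times, values, switch_time):
    """Return (untreated slope, treated slope) of a SARA trajectory.

    Assumption: each period is fitted separately, and the observation at
    switch_time is shared by both periods.
    """
    pre = [(t, y) for t, y in zip(times, values) if t <= switch_time]
    post = [(t, y) for t, y in zip(times, values) if t >= switch_time]
    slope_pre = ols_slope([t for t, _ in pre], [y for _, y in pre])
    slope_post = ols_slope([t for t, _ in post], [y for _, y in post])
    return slope_pre, slope_post


# Illustrative noise-free trajectory: 4 observations/year over two 2.5-year periods,
# progressing by 1 point/year untreated and 0.5 point/year on treatment.
times = [i / 4 for i in range(21)]
values = [10 + t if t <= 2.5 else 12.5 + 0.5 * (t - 2.5) for t in times]
slope_pre, slope_post = period_slopes(times, values, switch_time=2.5)
```

On this illustrative trajectory the two slopes come out at approximately 1.0 and 0.5 points per year. In the PDA approach, features of this kind would be computed for every patient in the training set before building the Pareto fronts.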
The two proposed tests were evaluated in a simulation study quantifying the type I error and power in trials of very few patients, based on a 4-parameter logistic NLMEM (4) estimated on the PROSPAX natural history study (NCT04297891). Trials with two periods of 2.5 years each (4 observations/year) were considered under different simulation scenarios: slower/faster disease progression, higher/lower residual error, number of subjects (1 or 5), and high or low magnitude of DE. To implement the PDA method, a training set of 500 patients was simulated using the NLMEM. For the CDDE method, a sensitivity analysis was performed to quantify the impact of model misspecification.
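The trial layout described above can be sketched as a small simulation of one patient. Everything model-specific here is an assumption made for illustration: the logistic parameterization (`s_min`, `s_max`, `rate`, `t50`), the parameter values, the additive Gaussian residual error, and the way the drug effect acts (slowing disease time by a factor `1 - drug_effect` after the switch) are hypothetical and are not taken from the published NLMEM. Between-subject variability is also omitted, and simulated scores are not constrained to the SARA range.

```python
import math
import random


def logistic4(t, s_min, s_max, rate, t50):
    """Generic 4-parameter logistic curve (assumed form, not the published model)."""
    return s_min + (s_max - s_min) / (1.0 + math.exp(-rate * (t - t50)))


def simulate_patient(drug_effect, sigma, rng, period_years=2.5, obs_per_year=4,
                     s_min=0.0, s_max=40.0, rate=0.3, t50=5.0):
    """Simulate SARA scores over an untreated period followed by a treated one.

    Assumption: after the switch, disease time advances at (1 - drug_effect)
    times its natural speed, so 0 means no effect and 1 a complete halt.
    """
    n_obs = int(2 * period_years * obs_per_year) + 1  # baseline included
    times = [i / obs_per_year for i in range(n_obs)]
    scores = []
    for t in times:
        if t <= period_years:
            disease_time = t
        else:
            disease_time = period_years + (1.0 - drug_effect) * (t - period_years)
        mean_score = logistic4(disease_time, s_min, s_max, rate, t50)
        scores.append(mean_score + rng.gauss(0.0, sigma))  # additive residual error
    return times, scores


rng = random.Random(2026)  # fixed seed for reproducibility
times, scores = simulate_patient(drug_effect=0.5, sigma=1.0, rng=rng)
```

Repeating such a simulation many times with `drug_effect=0` and with a non-zero value, and applying a test to each replicate, is the usual way to estimate type I error and power respectively.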
Results:
The CDDE method provided controlled type I error. In N-of-1 trials, the power was low for DE=0.5 (<60%). For DE=1, the power was low with high residual error (<50%) and high with low residual error (>65%). In N-of-5 trials, power increased in all scenarios; notably, for DE=1, it exceeded 85% for fast-progressing ataxias. However, sensitivity analyses revealed vulnerability to model misspecification. The PDA method showed lower power in all scenarios, especially with high residual error. Its power was nevertheless high (>70%) in the most optimistic scenarios, especially with low residual error.

Conclusions:
In this work, we proposed two new tests to evaluate a disease-modifying treatment effect in trials of very few patients (down to one). The proposed methods leverage information from natural history data. Our analysis showed that, given suitable settings, N-of-1 trials, and preferably N-of-5 trials, could be feasible with the proposed methods. However, our approaches did not consider a potential placebo effect, which could influence the estimation of the DE in practice. Furthermore, the PDA approach was meant to be applied to observed data, but because the available data were sparse, features were extracted from data simulated under an NLMEM. The sensitivity analysis also revealed that the CDDE test was sensitive to model misspecification.

References:
1. Synofzik M et al. Autosomal Recessive Cerebellar Ataxias: Paving the Way toward Targeted Molecular Therapies. Neuron. 2019 Feb 20;101(4):560–83.
2. Schüle R et al. Tailored antisense oligonucleotides for ultrarare CNS diseases: An experience-based best practice framework for individual patient evaluation. Mol Ther Nucleic Acids. 2025 Sep 9;36(3).
3. Jonker AH et al. The state-of-the-art of N-of-1 therapies and the IRDiRC N-of-1 development roadmap. Nat Rev Drug Discov. 2025 Jan;24(1):40–56.
4. Hendrickx N et al. Comparing randomized trial designs to estimate treatment effect in rare diseases with longitudinal models: a simulation study showcased by Autosomal Recessive Cerebellar Ataxias using the SARA score. BMC Med Res Methodol. 2025 Jul 30;25(1):179.
5. Schmitz-Hübsch T et al. Scale for the assessment and rating of ataxia: Development of a new clinical scale. Neurology. 2006 Jun 13;66(11):1717–20.
6. FDA. Rare Diseases: Natural History Studies for Drug Development [Internet]. 2020.
7. FDA. Considerations for the Use of Real-World Data and Real-World Evidence To Support Regulatory Decision-Making for Drug and Biological Products [Internet]. 2023.
8. Hsiao KJ et al. Multicriteria Similarity-Based Anomaly Detection Using Pareto Depth Analysis. IEEE Trans Neural Netw Learn Syst. 2016 Jun;27(6):1307–21.

Acknowledgements:

This work was supported by members of the EVIDENCE-RND consortium, which includes Alzahra Hamdan, Xiaomei Chen, Nicole Maria Heussen, Ralf-Dieter Hilgers, Thomas Klockgether, Yevgen Ryeznik, and Oleksandr Sverdlov. This work was funded by the European Joint Programme on Rare Diseases (EJP RD) Joint Transnational Call 2019 for the EJP RD WP20 Innovation Statistics consortium “EVIDENCE-RND” focusing on “Innovative Statistical Methodologies to Improve Rare Diseases Clinical Trials in Limited Populations” under the EJP RD Grant Agreement (n°825575) (to M.K., R.S., and M.S.), as well as by the European Union, project European Rare Disease Research Alliance (ERDERA), GA n°101156595, funded under call HORIZON-HLTH-2023-DISEASE-07 (to M.S., R.S., and F.M.).

Reference: PAGE 34 (2026) Abstr 12226 [www.page-meeting.org/?abstract=12226]

Poster: Oral: Methodology - New Modelling Approaches