Estelle Chasseloup (1), Xinyi Li (1), Adrien Tessier (2), Mats O. Karlsson (1)
(1) Dept. of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; (2) Pharmacometrics and Clinical Pharmacokinetics division, Institut de Recherches Internationales Servier, Suresnes, France
Objectives:
Non-linear mixed effects models (NLMEM) have proven helpful during drug development to characterize drug effects and inform decisions[1][2][3]. However, model misspecification typically has consequences: (i) placebo model misspecification will result in biased drug effect estimates and an inflated type I error, while (ii) drug model misspecification will lead to a biased drug effect model and a loss of power to detect the drug effect. Such perceived lack of robustness to model misspecification may thwart the use of model-based approaches.
The purpose of this work was to develop a new NLMEM approach called Individual Model Averaging (IMA), using mixture models, to overcome or attenuate these problems. The focus was on balanced two-arm designs, but unbalanced designs and dose-response were also investigated.
Methods:
Approaches description
The standard NLMEM approach (STD) models and tests for a drug effect using the likelihood ratio test (LRT) to discriminate between a reduced model (H0: no drug model) and a full model (H1: drug model for treated subjects only), where all subjects additionally have a placebo submodel. In IMA, all subjects have, through a mixture feature, a probability of being described by the “drug” model. This probability either depends on the patient allocation (H1: full) or does not (H0: reduced), and the LRT is likewise used to accept H1 or H0.
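As an illustration of the LRT decision described above, the following sketch compares NONMEM-style objective function values (OFV = −2 log-likelihood); the critical value 3.84 assumes one extra estimated parameter under H1 and a 5% significance level, and the OFV values shown are hypothetical:

```python
def lrt_accepts_h1(ofv_reduced, ofv_full, crit=3.84):
    """Likelihood ratio test on -2 log-likelihood objective function
    values: H1 (the drug model) is accepted when the drop in OFV from
    the reduced to the full model exceeds the chi-square(1) critical
    value (3.84 at alpha = 0.05)."""
    return (ofv_reduced - ofv_full) > crit

print(lrt_accepts_h1(1003.9, 1000.0))  # True: drop of 3.9 > 3.84
print(lrt_accepts_h1(1003.0, 1000.0))  # False: drop of 3.0 < 3.84
```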
The general equation for the mixture proportion is conditioned on the patient allocation (ARM=0 for not treated, ARM=1 for treated), as follows:
P = 2(1-r)[(0.5 + θ)·ARM + (0.5 + ((r-1)/r)·θ)·(1-ARM)]
where r is the placebo allocation ratio (for ARM=0), and θ is constrained to the open interval ]−r²/(2(1−r)²), r/(2(1−r))[ if r < 0.5, and to ]−1/2, r/(2(1−r))[ if r ≥ 0.5, under H1, but fixed to 0 under H0. For a balanced two-arm design under H0, the probability of being described by the drug model is 0.5 for all patients. When θ is estimated at its upper bound under H1, the model coincides with the standard full model.
For the IMA approach, the overall drug effect was obtained as 2θ(1−r)/r × TD, where TD is the typical drug effect of the drug submodel; the factor 2θ(1−r)/r is the between-arm difference in the probability of being described by the drug submodel, accounting for the placebo allocation ratio.
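The mixture probability and the derived drug effect scaling can be checked directly against the equations above; this is a minimal numerical sketch, not the authors' implementation:

```python
def mixture_probability(theta, arm, r):
    # Probability of being described by the "drug" submodel
    # (arm = 1 for treated, 0 for not treated; r = placebo allocation ratio)
    return 2 * (1 - r) * ((0.5 + theta) * arm
                          + (0.5 + (r - 1) / r * theta) * (1 - arm))

# Balanced two-arm design (r = 0.5) under H0 (theta = 0): P = 0.5 for all.
assert mixture_probability(0.0, 1, 0.5) == 0.5
assert mixture_probability(0.0, 0, 0.5) == 0.5

# At theta's upper bound r/(2(1-r)), the model coincides with the standard
# full model: P = 1 for treated and P = 0 for untreated subjects.
r = 0.5
theta_max = r / (2 * (1 - r))
assert mixture_probability(theta_max, 1, r) == 1.0
assert mixture_probability(theta_max, 0, r) == 0.0

# The between-arm difference in P equals the scaling factor 2*theta*(1-r)/r.
theta = 0.2
diff = mixture_probability(theta, 1, r) - mixture_probability(theta, 0, r)
assert abs(diff - 2 * theta * (1 - r) / r) < 1e-12
```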
Data
Three placebo data sets were used:
1. ADAS-cog (Alzheimer’s Disease Assessment Scale-Cognitive) scores, ranging between 0 and 70 and treated as continuous data, from 800 patients followed up for 2-3 years with 4-5 observations each.[4]
2. Likert pain score data (11-point scale) from 230 patients with diabetic neuropathy, followed up for 4 months with weekly records.[5][6]
3. Daily seizure count data from 500 patients with refractory partial seizures, followed up for 12 weeks while maintained on their standard antiepileptic treatment.[7]
When exploring STD and IMA properties in the presence of a drug effect, the ADAS-cog data were modified by adding, to the observed data, values simulated from a time-dependent exponential drug model with 30% between-subject variability. To explore their properties in the presence of dose-response, an Emax model was added on top to simulate 1:1:1:1 studies with three treated arms (20%, 40%, and 80% of a maximum effect of 10 points) and a placebo arm.
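A minimal sketch of how such a simulated drug effect could be added to observed placebo data is given below; the abstract does not give the exact model, so the functional form, the rate constant `keq`, and the follow-up length are illustrative assumptions:

```python
import math
import random

def simulated_drug_effect(t_weeks, typical_effect, bsv_sd=0.3, keq=0.05):
    # Hypothetical time-dependent exponential drug model: the effect
    # approaches an individual asymptote with (assumed) rate keq, and the
    # individual effect carries ~30% log-normal between-subject variability.
    individual = typical_effect * math.exp(random.gauss(0.0, bsv_sd))
    return individual * (1.0 - math.exp(-keq * t_weeks))

random.seed(2023)
# Emax dose-response: three treated arms at 20%, 40%, and 80% of a
# maximum effect of 10 points, plus a placebo arm (no added effect).
typical_arm_effects = [0.0, 0.2 * 10, 0.4 * 10, 0.8 * 10]
simulated = [simulated_drug_effect(104, e) for e in typical_arm_effects]
```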
Comparison
Using the real data sets with repeated randomisations of the design features (N=1000), IMA and STD were compared in terms of type I error rate (at α=0.05) and bias in the estimated drug effect under a scenario with no drug effect in the data. This applied both to two-arm comparisons (balanced or unbalanced) and to dose-response analyses. For STD, it was additionally investigated whether “standard” model averaging [8] could improve the results.
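Estimating the type I error over repeated randomizations amounts to counting how often the LRT rejects H0 when no drug effect is present in the data; a toy version, using hypothetical OFV drops, is:

```python
def type1_error_rate(ofv_drops, crit=3.84):
    # Fraction of re-randomized fits (no drug effect in the data) whose
    # OFV drop from the reduced to the full model exceeds the chi-square(1)
    # critical value at alpha = 0.05; ~5% indicates a controlled error rate.
    return sum(drop > crit for drop in ofv_drops) / len(ofv_drops)

# Hypothetical OFV drops from four re-randomized fits:
print(type1_error_rate([5.0, 1.0, 2.0, 4.0]))  # 0.5
```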
The ADAS-cog data modified by the addition of a drug effect or a dose-response were used to compare the IMA and STD methods in terms of power and bias in the estimated drug effect under scenarios with a drug effect present in the data [9][10].
The robustness of the methods towards model misspecification was tested with various combinations of common placebo and drug models, with or without inter-individual variability.
Results:
For each data set, the best of the placebo models tried (in terms of objective function value) had reasonable goodness-of-fit plots (PRED vs DV, IPRED vs DV, |iWRES| vs IPRED, and CWRES vs time).
Data without drug effect
Using the placebo data sets modified to mimic randomized trials without drug effect, the STD type I error was inflated (median values across all scenarios: 26%, 97%, and 45% for the ADAS-cog, pain score, and seizure count data, respectively) and, for the majority of scenarios, associated with considerable bias in the drug effect. In contrast, IMA showed a controlled type I error (3.5%, 5.0%, and 5.0%) and unbiased drug effect estimates regardless of the placebo-drug model combination tried. For dose-response and unbalanced designs, similar trends were observed: uncontrolled type I error rates and biased drug effects for STD versus controlled type I error and unbiased drug effects for IMA.
For STD with model averaging across both placebo and drug models, a biased and significant treatment effect was concluded for all three data sets, even though no drug effect was present.
Data with drug effect
When using the ADAS-cog data modified by the addition of a drug effect, IMA had higher power than STD, as the STD power was pulled down by the empirical cut-off used to correct for its inflated type I error.
When using the ADAS-cog data modified by the addition of a dose-response (typical values of 1.7, 3.5, and 6.9 points at tlast), IMA had no appreciable bias in the drug effect estimates (typical values of 1.4, 3.0, and 6.9), contrary to STD (typical values of 0.3, 0.6, and 1.3), but both had high power (>95%).
Conclusions:
The standard method appears to have flaws when estimating drug effects with NLMEM and real data. Real patient time-course data often have complex trajectories; even when the main pattern of the time-course is taken into account, there are typically additional features in real data. With STD, any feature of the data not described by the placebo model is likely to make a new model feature significant, leading to biased drug effects and inflated type I error rates whenever the drug model provides a new degree of freedom to describe the data.
The individual model averaging approach does not suffer from this issue, since the placebo and drug models are fitted together to the whole data set, in both the reduced and the full model, making the approach robust to model misspecification.
References:
[1]Marshall S, Madabushi R, Manolis E, et al (2019) Model-Informed Drug Discovery and Development: Current Industry Good Practice and Regulatory Expectations and Future Perspectives. CPT Pharmacometrics Syst Pharmacol 8:87–96. https://doi.org/10.1002/psp4.12372
[2]Lalonde RL, Kowalski KG, Hutmacher MM, et al (2007) Model-based Drug Development. Clinical Pharmacology & Therapeutics 82:21–32. https://doi.org/10.1038/sj.clpt.6100235
[3]Milligan PA, Brown MJ, Marchant B, et al (2013) Model-Based Drug Development: A Rational Approach to Efficiently Accelerate Drug Development. Clinical Pharmacology & Therapeutics 93:502–514. https://doi.org/10.1038/clpt.2013.54
[4]Ito K, Corrigan B, Zhao Q, et al (2011) Disease progression model for cognitive deterioration from Alzheimer’s Disease Neuroimaging Initiative database. Alzheimers Dement 7:151–160. https://doi.org/10.1016/j.jalz.2010.03.018
[5]Plan EL, Elshoff J-P, Stockis A, et al (2012) Likert pain score modeling: a Markov integer model and an autoregressive continuous model. Clin Pharmacol Ther 91:820–828. https://doi.org/10.1038/clpt.2011.301
[6]Schindler E, Karlsson MO (2017) A Minimal Continuous-Time Markov Pharmacometric Model. AAPS J 19:1424–1435. https://doi.org/10.1208/s12248-017-0109-1
[7]Trocóniz IF, Plan EL, Miller R, Karlsson MO (2009) Modelling overdispersion and Markovian features in count data. J Pharmacokinet Pharmacodyn 36:461–477. https://doi.org/10.1007/s10928-009-9131-y
[8]Aoki Y, Röshammar D, Hamrén B, Hooker AC (2017) Model selection and averaging of nonlinear mixed-effect models for robust phase III dose selection. J Pharmacokinet Pharmacodyn 44:581–597. https://doi.org/10.1007/s10928-017-9550-0
[9]Vong C, Bergstrand M, Nyberg J, Karlsson MO (2012) Rapid sample size calculations for a defined likelihood ratio test-based power in mixed-effects models. AAPS J 14:176–186. https://doi.org/10.1208/s12248-012-9327-8
[10]Ueckert S, Karlsson MO, Hooker AC (2016) Accelerating Monte Carlo power studies through parametric power estimation. J Pharmacokinet Pharmacodyn 43:223–234. https://doi.org/10.1007/s10928-016-9468-y
Reference: PAGE () Abstr 9541 [www.page-meeting.org/?abstract=9541]
Poster: Oral: Methodology - New Tools