**Modelling Techniques Handling Dynamic Pain Scores Characteristics**

Elodie Plan (1), Ron Keizer (1), Jan-Peer Elshoff (2), Armel Stockis (3), Laura Sargentini-Maier (3, 4), Mats Karlsson (1)

(1) Department of Pharmaceutical Biosciences, Uppsala University, Sweden; (2) Global Biostatistics, UCB Biosciences GmbH, Monheim, Germany; (3) Global Exploratory Development, UCB Pharma SA, Braine-l’Alleud, Belgium; (4) Current affiliation: Department of M&S, Ablynx NV, Zwijnaarde, Belgium.

**Objectives**

Pain scales defined by the NIH Pain Consortium [1] include the Numeric Rating Scale, also known as Likert Scale. In this 11-point measurement instrument, the lowest score 0 corresponds to no pain and the highest score 10 to the worst possible pain. Non-linear mixed effects modelling has demonstrated high potential to treat data close to their true nature. The main characteristic of scores is interval-constraints, as there are a finite number of ordered categories. Secondly, like many symptoms, they follow a time-course. Finally, a usually overlooked characteristic of frequently assessed scores is serial correlation between observations.

The objectives of this study were to explore pain scores characteristics, as well as to develop platform models and modelling techniques adapted to fit real data.

**Methods**

*1. Clinical trial*

Data from the placebo arms of three Phase 3 multi-centre, randomized, double-blind clinical trials were considered. Screened patients were diagnosed with diabetes mellitus and showed symptoms of painful distal diabetic neuropathy. The primary variable was the overall pain intensity self-rated with a Likert scale provided to each subject as part of a diary and completed on a daily basis.

*2. Simulation study*

100 stochastic simulations were generated with a baseline response ordered categorical model, whose parameter values were derived from a fit to the real data. A second set of 100 simulated datasets included an empirical linear drug effect on the logit of the categorical part, and the design was changed to four parallel dose arms.

*2.1. Structural models handling interval-constraints*

Simulated data were analysed in NONMEM VI [2] with an ordered categorical model and two alternative models: a truncated generalized Poisson model [3] and a logit-transformed continuous model.
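As a minimal sketch of how a truncated generalized Poisson model can respect the 0 to 10 constraint, the snippet below renormalises the standard generalized Poisson probabilities over the eleven admissible scores. The function names and the values of `lam` and `delta` are illustrative assumptions, not the parameterisation or the estimates used in the analysis.

```python
import math

def generalized_poisson_pmf(y, lam, delta):
    """Untruncated generalized Poisson probability of score y.

    lam is the mean-related parameter and delta the dispersion parameter;
    delta = 0 gives the ordinary Poisson distribution.
    """
    log_p = (math.log(lam)
             + (y - 1) * math.log(lam + y * delta)
             - (lam + y * delta)
             - math.lgamma(y + 1))
    return math.exp(log_p)

def truncated_gp_pmf(lam, delta, max_score=10):
    """Probabilities of scores 0..max_score, renormalised to sum to 1."""
    raw = [generalized_poisson_pmf(y, lam, delta) for y in range(max_score + 1)]
    total = sum(raw)
    return [p / total for p in raw]

# Hypothetical parameter values, chosen only for illustration.
probabilities = truncated_gp_pmf(lam=6.0, delta=0.1)
for score, p in enumerate(probabilities):
    print(score, round(p, 4))
```

Renormalising over the admissible scores keeps all probability mass inside the scale, which is the property the structural models were compared on.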

*3. Data analysis*

Real data were analysed in NONMEM 7 with three competing models: one truncated generalized Poisson model and two logit-transformed continuous models.

*3.1. Model components handling time-course*

The population mean score time-course can be characterized through λ in the generalized Poisson model or through the logit of IPREDs in the two continuous models. A non-linear decrease attributable to a placebo effect was described with functions restricting scores to remain between 0 and 10.
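One way such a bounded time-course could look is sketched below: an exponential onset of a proportional placebo effect on the baseline, with a logit transform mapping scores on the 0 to 10 scale to the real line and back. The functional form and function names are assumptions made for illustration; the default values are the rounded estimates reported under Results (baseline about 6.1, maximum effect about 20%, half-life about 30 days).

```python
import math

def placebo_mean_score(t_days, baseline=6.1, max_effect=0.20, half_life_days=30.0):
    """Mean score at time t with an exponentially developing placebo effect.

    Assumed form: baseline * (1 - max_effect * (1 - exp(-k * t))).
    With 0 < baseline < 10 and 0 <= max_effect < 1 the result stays in (0, 10).
    """
    k = math.log(2) / half_life_days
    return baseline * (1.0 - max_effect * (1.0 - math.exp(-k * t_days)))

def logit_score(score, max_score=10.0):
    """Map a score strictly inside (0, max_score) to the real line."""
    p = score / max_score
    return math.log(p / (1.0 - p))

def inverse_logit_score(x, max_score=10.0):
    """Map a real number back to the open interval (0, max_score)."""
    return max_score / (1.0 + math.exp(-x))

for day in (0, 30, 126):
    mean_score = placebo_mean_score(day)
    print(day, round(mean_score, 2), round(logit_score(mean_score), 3))
```

Under these assumed values the mean score falls from 6.1 at baseline to about 5.49 after one half-life and approaches 4.88 asymptotically. Because any model addition on the logit scale is back-transformed through `inverse_logit_score`, predictions cannot leave the 0 to 10 range.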

*3.2. Model components handling serial correlation*

A Markov process [4, 5] was combined with the generalized Poisson model relaxing the between-observation independence assumption. This discrete-time process consisted of first-order components formulated to inflate the dependence between the present score and the transition magnitude from the preceding score.
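The idea of a first-order dependence on the preceding score can be illustrated with a simple reweighting scheme: probabilities of small transitions are multiplied by inflation factors and the distribution is renormalised. This is only one possible formulation, written under assumptions; the multiplicative form, the function name, and the factor values are hypothetical and are not the parameterisation estimated in this work.

```python
def markov_inflated_pmf(base_pmf, previous_score, inflation):
    """Reweight a score distribution given the preceding score.

    base_pmf: probabilities of scores 0..10 ignoring the previous observation.
    inflation: dict mapping an absolute transition size to a multiplicative
               factor; transition sizes not listed keep a factor of 1.
    """
    weighted = []
    for score, p in enumerate(base_pmf):
        jump = abs(score - previous_score)
        weighted.append(p * inflation.get(jump, 1.0))
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical inputs: a flat base distribution and inflation of small jumps.
base = [1.0 / 11.0] * 11
factors = {0: 8.0, 1: 4.0, 2: 2.0, 3: 1.5}
conditional = markov_inflated_pmf(base, previous_score=6, inflation=factors)
for score, p in enumerate(conditional):
    print(score, round(p, 3))
```

With these made-up factors, staying at the previous score becomes the most likely outcome and large jumps become rarer, which mimics the day-to-day persistence seen in diary data.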

Correlated errors [5, 6] were introduced in the first continuous model with an autoregressive time series (AR(1)). It described a continuous-time correlation between two subsequent errors, which exponentially decreased during the time-interval between two observations.
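A continuous-time AR(1) error structure of this kind can be sketched as follows: the correlation between two errors decays exponentially with the time between them, and each new error is built from the previous one plus an independent innovation. The function names are assumptions, and the numbers are illustrative values taken from the rounded Results (standard deviation 1.8, correlation 0.47 after one day); this is not the NONMEM implementation.

```python
import math
import random

def ar1_correlation(dt_days, rate_per_day):
    """Correlation between two errors separated by dt_days."""
    return math.exp(-rate_per_day * dt_days)

def simulate_ar1_errors(times, sd, rate_per_day, rng):
    """Simulate serially correlated errors at the given observation times."""
    errors = [rng.gauss(0.0, sd)]
    for previous_t, t in zip(times, times[1:]):
        rho = ar1_correlation(t - previous_t, rate_per_day)
        # Innovation variance is chosen so the marginal variance stays sd**2.
        innovation_sd = sd * math.sqrt(1.0 - rho ** 2)
        errors.append(rho * errors[-1] + rng.gauss(0.0, innovation_sd))
    return errors

# Rate chosen so that the correlation after one day is 0.47.
rate = -math.log(0.47)
daily_times = [float(day) for day in range(14)]
errors = simulate_ar1_errors(daily_times, sd=1.8, rate_per_day=rate,
                             rng=random.Random(1))
print([round(e, 2) for e in errors])
print("correlation half-life (days):", round(math.log(2) / rate, 2))
```

A one-day correlation of 0.47 corresponds to a correlation half-life of roughly 0.9 days, so under this form errors two or three days apart are only weakly related.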

A stochastic process [7, 8] was implemented in the second continuous model with Stochastic Differential Equations (SDEs). Drift from individual model predictions was incorporated in the system as a standard Wiener process, whose variance increases linearly in time.
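The linear growth of variance in a Wiener process can be checked with a small simulation. The sketch below covers only the diffusion component, not the full SDE estimation machinery described in [7, 8]; the function names are assumptions, and the diffusion variance of 0.038 per day on the logit scale is taken from the Results purely as an illustrative value.

```python
import math
import random

def simulate_wiener_path(n_steps, dt_days, diffusion_var_per_day, rng):
    """Return W(0), W(dt), ..., W(n_steps * dt) for a scaled Wiener process."""
    step_sd = math.sqrt(diffusion_var_per_day * dt_days)
    path = [0.0]
    for _ in range(n_steps):
        # Independent Gaussian increments with variance proportional to dt.
        path.append(path[-1] + rng.gauss(0.0, step_sd))
    return path

def empirical_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(2024)
paths = [simulate_wiener_path(28, 1.0, 0.038, rng) for _ in range(5000)]
for day in (7, 14, 28):
    observed = empirical_variance([path[day] for path in paths])
    expected = 0.038 * day
    print(day, round(observed, 3), round(expected, 3))
```

The simulated variance at each day should lie close to 0.038 times the elapsed time, which is the sense in which the drift away from the individual prediction accumulates between observations.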

*4. Model diagnostics*

Model evaluation was carried out through newly developed simulation-based diagnostics, adapted VPCs [9], and, for continuous models, also through residual-type diagnostics, based on CWRES [10].

**Results**

*1. Clinical trial*

A total of 231 neuropathic patients were randomized to the placebo arms. They provided 22,492 pain measurements over 18 weeks. All possible scores were present in the raw data. The average frequencies showed that central scores were more common than tail scores.

*2. Simulation study*

*2.1. Structural models handling interval-constraints*

The ordered categorical model included twice as many parameters as the two alternative models: a truncated generalized Poisson model and a logit-transformed continuous model. They all adequately fitted the simulated data. Resimulations after estimation produced proportions of scores in agreement with originally simulated distributions. The statistical power to detect a drug effect was high for all models.

*3. Data analysis*

*3.1. Model components handling time-course*

The baseline pain score was estimated between 6.1 and 6.2 by the three competing models: one truncated generalized Poisson model and two logit-transformed continuous models. An exponential decay affecting the baseline was identified in all models. The estimated maximum placebo effect and its half-life were also similar across models, around 20% and 30 days, respectively. Parameter precision was reasonable for all models.

*3.2. Model components handling serial correlation*

The Markov process (ΔOFV≈11,000; df=13) combined with the generalized Poisson model inflated the probabilities of absolute transition values 0, 1, 2, and 3. The probability of null transitions was modelled as time-dependent and was 55% at its maximum.

Correlated errors (ΔOFV≈2,000; df=1) introduced in the first continuous model had a standard deviation of 1.8. Their autocorrelation exponentially decreased with time and was 47% after one day.

The stochastic process (ΔOFV≈1,800; df=2) implemented in the second continuous model made autocorrelation in residuals disappear and reduced the IIVs. The variance of the scaling diffusion term was estimated at 0.038 score²/day on the logit scale.

*4. Model diagnostics*

Exploration of specific between-score transitions in terms of frequency or time-course was found useful and transferred to the VPC technique. Simulations of individuals were more realistic after introduction of correlation components. Examination of consecutive residuals was used for inspection of the ability of AR(1) and SDEs to handle autocorrelation in residuals.

**Conclusions**

Truncated generalized Poisson and logit-transformed continuous models are able to handle interval-constraints of Likert scales. Time-course functions of mean scores can be implemented similarly into the models. In this work, three alternative approaches, which are all new in pain modelling, are proposed to address serial correlation between measurements.

All processes could be implemented in NONMEM, and the estimation methods FOCE and LAPLACE were found appropriate, as previously shown [11, 12]. The Markov and the SDE models ran in 1 h, whereas the AR(1) model took 1 month. Among the two nested continuous models, SDE led to a slightly higher OFV with 1 df more than AR(1), but was considerably faster and showed no appreciable difference in the simulation-based diagnostics.

Likert pain scores are important clinical endpoints but difficult to model. This work proposes three new models for handling them and presents model diagnostics that facilitate model inspection and development.

**References**

[1] McCaffery, M., & Beebe, A., *Pain: Clinical Manual for Nursing Practice*. 1993, Baltimore: V.V. Mosby Company.

[2] Beal, S., Sheiner, L.B., Boeckmann, A., & Bauer, R.J., *NONMEM User's Guides*. 1989-2009, Ellicott City, MD, USA: Icon Development Solutions.

[3] Gschlossl, S. and Czado, C., *Modelling count data with overdispersion and spatial effects.* Statistical Papers, 2008. **49**(3): p. 531-552.

[4] Troconiz, I.F., Plan, E.L., Miller, R., and Karlsson, M.O., *Modelling overdispersion and Markovian features in count data.* J Pharmacokinet Pharmacodyn, 2009. **36**(5): p. 461-77.

[5] Silber, H.E., Kjellsson, M.C., and Karlsson, M.O., *The impact of misspecification of residual error or correlation structure on the type I error rate for covariate inclusion.* J Pharmacokinet Pharmacodyn, 2009. **36**(1): p. 81-99.

[6] Karlsson, M.O., Beal, S.L., and Sheiner, L.B., *Three new residual error models for population PK/PD analyses.* J Pharmacokinet Biopharm, 1995. **23**(6): p. 651-72.

[7] Overgaard, R.V., Jonsson, N., Tornoe, C.W., and Madsen, H., *Non-linear mixed-effects models with stochastic differential equations: implementation of an estimation algorithm.* J Pharmacokinet Pharmacodyn, 2005. **32**(1): p. 85-107.

[8] Tornoe, C.W., Overgaard, R.V., Agerso, H., Nielsen, H.A., Madsen, H., and Jonsson, E.N., *Stochastic differential equations in NONMEM: implementation, application, and comparison with ordinary differential equations.* Pharm Res, 2005. **22**(8): p. 1247-58.

[9] Karlsson, M.O. and Holford, N.H.G., *A Tutorial on Visual Predictive Checks*, in *Population Approach Group in Europe, PAGE 17*. 2008, Abstr 1434: Marseille, France.

[10] Hooker, A.C., Staatz, C.E., and Karlsson, M.O., *Conditional weighted residuals (CWRES): a model diagnostic for the FOCE method.* Pharm Res, 2007. **24**(12): p. 2187-97.

[11] Plan, E.L., Maloney, A., Mentré, F., Karlsson, M.O., and Bertrand, J., *Nonlinear Mixed Effects Estimation Algorithms: A Performance Comparison for Continuous Pharmacodynamic Population Models*, in *Population Approach Group in Europe, PAGE 19*. 2010, Abstr 1880: Berlin, Germany.

[12] Plan, E.L., Maloney, A., Troconiz, I.F., and Karlsson, M.O., *Performance in population models for count data, part I: maximum likelihood approximations.* J Pharmacokinet Pharmacodyn, 2009. **36**(4): p. 353-66.