Evaluation of Stepwise Covariate Model Building Combined with Cross-Validation
Takayuki Katsube (1, 2), Akash Khandelwal (1), Kajsa Harling (1), Andrew C Hooker (1), Mats O Karlsson (1)
(1) Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; (2) Clinical Research Department, Shionogi & Co., Ltd., Japan
Introduction: Covariate models are often built using stepwise covariate model building (SCM), a procedure that is not intrinsically designed to provide good predictive performance. Cross-validation (XV) is a procedure for estimating prediction error using multiple subsets of a dataset and may be used to select an appropriate model size [1]. If the main goal is predictive modeling, SCM combined with XV for determining model size may be useful.
Objectives: The objective of this study is to evaluate covariate model building using SCM combined with XV.
Methods: Five-fold XV was used in this study. The dataset was randomly split into 5 parts with approximately equal numbers of subjects. Each part in turn served as test data, with the remaining 4 parts as training data. For each training dataset, covariate models were built using SCM with or without linearization (a further development of [2] using the FOCE approximation). At each step in the SCM, the objective function value on the corresponding test data (XV OFV) was calculated without re-estimation (MAXEVAL=0 in NONMEM) to evaluate predictive performance. The random split was repeated 3 times. Consequently, the sum of XV OFV over the 15 test datasets was calculated for each number of relations. The number of relations at which the sum of XV OFV was minimal was taken to be the appropriate model size. Pharmacokinetic datasets for phenobarbital (4 test relations), moxonidine (13 test relations) and pefloxacin (14 test relations) were used to evaluate the procedure.
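The selection rule described in the Methods can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the study: `xv_ofv` is a hypothetical callback standing in for "build the SCM model with n relations on the training subjects, then evaluate the OFV on the test subjects without re-estimation", and `toy_xv_ofv`, the subject count of 50, and the inclusion of the base model (0 relations) are assumptions made only so the example runs.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical signature: (training subject IDs, test subject IDs, number of relations) -> XV OFV
OfvFn = Callable[[List[int], List[int], int], float]


def split_subjects(ids: Sequence[int], k: int, rng: random.Random) -> List[List[int]]:
    """Randomly split subject IDs into k parts of approximately equal size."""
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def select_model_size(
    ids: Sequence[int],
    max_relations: int,
    xv_ofv: OfvFn,
    k: int = 5,
    n_splits: int = 3,
    seed: int = 0,
) -> Tuple[int, Dict[int, float]]:
    """Sum XV OFV over all k * n_splits test sets for each number of relations
    and return the number of relations with the smallest sum."""
    rng = random.Random(seed)
    # Assumption: model size 0 (base model, no covariate relations) is also scored.
    totals = {n: 0.0 for n in range(max_relations + 1)}
    for _ in range(n_splits):
        folds = split_subjects(ids, k, rng)
        for i, test in enumerate(folds):
            # Training data are the subjects in the remaining k - 1 parts.
            train = [s for j, fold in enumerate(folds) if j != i for s in fold]
            for n in totals:
                totals[n] += xv_ofv(train, test, n)
    # On ties, min() keeps the first key, i.e. the smaller model.
    best = min(totals, key=totals.get)
    return best, totals


def toy_xv_ofv(train: List[int], test: List[int], n_relations: int) -> float:
    """Made-up stand-in for a NONMEM evaluation; its minimum is at 2 relations."""
    return len(test) * (10.0 + (n_relations - 2) ** 2)


if __name__ == "__main__":
    best, totals = select_model_size(range(1, 51), max_relations=4, xv_ofv=toy_xv_ofv)
    print("selected number of relations:", best)
```

In practice the callback would wrap the estimation and evaluation runs; the toy function only demonstrates how the 15 test-set values are accumulated per model size before the minimum is taken.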
Results: The sum of XV OFV was minimal at 2 relations for phenobarbital and moxonidine, while for pefloxacin the minimum was at the maximal number of relations. The optimal model size was the same for SCM and linearized SCM. Differences in prospective OFV between random splits within a method were larger than differences between SCM with and without linearization. The optimal number of relations predicted by SCM combined with XV was the same as (phenobarbital and moxonidine) or larger than (pefloxacin) that obtained using standard SCM (forward addition (p<0.05) and backward deletion (p<0.01)).
Conclusions: These results suggest that covariate model building using SCM combined with XV is feasible. Using XV to determine a suitable model size is expected to give better predictive model performance. Using the linearized SCM speeds up the process and makes SCM combined with XV practical for real-world problems.
[1] Breiman L, Spector P. Int Stat Rev. 1992. 60:291-319.
[2] Jonsson EN, Karlsson MO. Pharm Res. 1998. 15:1463-1468.