Implicit and efficient handling of missing covariate information using full random effects modelling
Joakim Nyberg (1), Mats O. Karlsson (1,2) and E. Niclas Jonsson (1)
(1) Pharmetheus, (2) Department of Pharmaceutical Biosciences, Uppsala University
Introduction: Covariates are observable predictors that are included in models to reduce unexplained variability. Covariate coefficients can be identified and estimated in many different ways, but a challenge common to all methods is how to handle missing covariates. There are many ways to handle missing covariates, and the choice of method depends on whether the data are missing at random or not [1]. A common approach within the field of population pharmacokinetics and pharmacodynamics is to use median imputation.
In the present work we evaluate how well a new full model covariate estimation method, the full random effects model (FREM) approach [2], handles missing covariate information. In this method the covariates are treated as observed data points and are modelled as random effects instead of being treated as error-free explanatory variables. Since missing covariates are handled like any missing dependent variable, i.e. they are simply not included in the analysis, FREM handles missingness implicitly. However, the analysis is still informed about the missing covariate values through their correlation with other covariates and dependent variables, because the variance-covariance matrix of all parameters and covariates is estimated for the whole population.
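The role of the joint variance-covariance matrix can be illustrated with a minimal sketch. It assumes that parameter random effects and covariates are jointly multivariate normal, so that the expected parameter deviation given the covariates follows from the standard conditional-mean formula. The matrix `omega`, its values, and the function names are hypothetical and are not taken from the model in this abstract.

```python
import numpy as np

# Hypothetical joint covariance of one parameter random effect (index 0)
# and two covariates (indices 1 and 2). Values are illustrative only.
omega = np.array([
    [0.20, 0.10, 0.05],
    [0.10, 1.00, 0.30],
    [0.05, 0.30, 0.50],
])

def implied_coefficients(omega, n_par):
    """Coefficients of parameters on covariates implied by the joint covariance."""
    s_pc = omega[:n_par, n_par:]   # parameter-covariate block
    s_cc = omega[n_par:, n_par:]   # covariate-covariate block
    # Solve s_cc @ X = s_pc.T instead of forming an explicit inverse.
    return np.linalg.solve(s_cc, s_pc.T).T   # shape (n_par, n_cov)

def conditional_shift(omega, n_par, cov_dev):
    """Expected parameter deviation given covariate deviations from their means.

    NaN marks a missing covariate; only the observed covariates are conditioned on.
    """
    cov_dev = np.asarray(cov_dev, dtype=float)
    obs = ~np.isnan(cov_dev)
    if not obs.any():
        return np.zeros(n_par)     # no covariate information: no shift
    s_pc = omega[:n_par, n_par:][:, obs]
    s_cc = omega[n_par:, n_par:][np.ix_(obs, obs)]
    return s_pc @ np.linalg.solve(s_cc, cov_dev[obs])

coef = implied_coefficients(omega, n_par=1)
shift_full = conditional_shift(omega, 1, [1.0, 0.0])       # both covariates observed
shift_part = conditional_shift(omega, 1, [1.0, np.nan])    # second covariate missing
```

In this sketch an individual with a missing covariate needs no imputed value: the calculation conditions only on what was observed, which is one way to picture why missingness needs no explicit handling.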
The FREM approach to handling missing covariate information is compared to the more traditional full fixed effects modelling (FFEM) [3] approach with median imputation. In the FFEM approach, the covariates are treated as independent, error-free predictors, which are linked to the model parameters through estimated fixed-effects parameters.
Objectives: To investigate the properties of the FREM approach in the presence of missing covariate data, compared to the traditional FFEM approach with median imputation.
Methods: A previously developed model [4] (a bi-exponential model with drifting baseline, parameterized with 6 parameters: BASE, BASESL, BP, HLKOFF, HLKON and PLMAX, with SEX, birth length (BL) and birth weight (BW) as covariates on all parameters), describing the height-for-age Z-score in children (0-15 years) in low- and middle-income countries, was used to investigate estimation properties with different missing covariate patterns. Observed covariates and the realized design (~8 samples per child) from an Indian cohort (n=6626) were used to sample individuals with complete covariate information. The FFEM model was used to simulate 1000 datasets with 1000 children in each dataset, with different levels of covariate information missing at random: 0%, 10%, 30%, 70% and 90%. A FREM model and an FFEM model were used to re-estimate the model given the simulated datasets, and bias and precision were computed and visualized. In the FFEM models, missing covariates were imputed using the median value. With FREM, missing covariates were treated as missing data and handled implicitly.
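The two data-handling steps described above, removing covariate values at random and median imputation, can be sketched as follows. This is an illustration only: the abstract does not state whether covariates were removed per subject or per covariate, so the sketch assumes all covariates of a selected subject are removed together. The function names and the simulated birth-weight values are hypothetical.

```python
import numpy as np

def mask_at_random(cov, frac_missing, rng):
    """Return a copy of cov with round(frac_missing * n) subjects set to NaN.

    cov has one row (or one element) per subject. Assumption: a selected
    subject loses all covariate values at once.
    """
    cov = np.asarray(cov, dtype=float).copy()
    n = cov.shape[0]
    n_missing = int(round(frac_missing * n))
    idx = rng.choice(n, size=n_missing, replace=False)
    cov[idx] = np.nan
    return cov

def impute_median(cov):
    """Replace NaN by the median of the observed values of the same covariate."""
    cov = np.asarray(cov, dtype=float)
    med = np.nanmedian(cov, axis=0)
    return np.where(np.isnan(cov), med, cov)

rng = np.random.default_rng(0)
bw = rng.normal(3.0, 0.5, size=1000)   # hypothetical birth weights (kg)

# Missingness levels taken from the abstract.
for frac in (0.0, 0.1, 0.3, 0.7, 0.9):
    masked = mask_at_random(bw, frac, rng)
    imputed = impute_median(masked)     # input for the FFEM analysis
    # In a FREM analysis the NaN rows would simply be left out as covariate observations.
    print(f"{frac:.0%} missing: SD of imputed covariate = {imputed.std(ddof=1):.3f}")
```

The printed standard deviation shrinks as more values are replaced by a single median, which shows how much covariate variability the imputed dataset loses at high missingness.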
Results: Overall, the FREM approach did not exhibit any bias in the estimated covariate coefficients, even with 90% missing covariate information. In contrast, median imputation using FFEM resulted in increasingly biased coefficients with increasing degree of missing information. For example, the sd normalized coefficient bias for BW on BASE with FFEM and FREM, at 90% missing covariate information, was 3.47 versus 0.01, respectively. Precision decreased with increasing percentage missing for both FFEM and FREM, but the imprecision was more pronounced with the FFEM approach. For SEX on HLKOFF, for example, the sd of the normalized coefficient at 10% missing covariate information was 0.046 with FREM against 0.169 with FFEM.
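Bias and precision across simulation replicates can be summarized along the following lines. The abstract does not define the normalization behind the reported values, so the `scale` argument (for example, the standard deviation of the covariate) and the function name are assumptions made for illustration. Bias is taken as the mean scaled estimate minus the scaled true value, and precision as the standard deviation of the scaled estimates.

```python
import numpy as np

def bias_and_precision(estimates, true_value, scale=1.0):
    """Summarize re-estimated coefficients from repeated simulated datasets.

    estimates  : one coefficient estimate per simulated dataset
    true_value : coefficient used in the simulation
    scale      : hypothetical normalization factor, e.g. the covariate SD
    """
    est = np.asarray(estimates, dtype=float) * scale
    bias = est.mean() - true_value * scale
    precision = est.std(ddof=1)   # spread of the estimates across replicates
    return bias, precision

# Toy numbers, not results from the study.
bias, sd = bias_and_precision([1.2, 1.3, 1.4], true_value=1.0)
```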
Conclusions: The FREM approach handled the missing covariate information more efficiently than the FFEM approach with median imputation, making it a promising tool for situations with large amounts of missing covariate data, for example when bridging between studies from different development phases.
[1] Å. M. Johansson and M. O. Karlsson. Comparison of Methods for Handling Missing Covariate Data. AAPS J. 2013 Oct; 15(4): 1232–1241.
[2] M. O. Karlsson. A full model approach based on the covariance matrix of parameters and covariates. PAGE 21 (2012) Abstr 2455 [www.page-meeting.org/?abstract=2455].
[3] M. Gastonguay. Full Covariate Models as an Alternative to Methods Relying on Statistical Significance for Inferences about Covariate Effects: A Review of Methodology and 42 Case Studies. PAGE 20 (2011) Abstr 2229 [www.page-meeting.org/?abstract=2229].
[4] E. N. Jonsson, J. Nyberg, J. Häggström, Lifecycle Auxology & Neurocognitive Development team, representing the Healthy Birth, Growth and Development knowledge integration (HBGDki) community. Determinants for physical growth patterns in low- and middle-income countries. PAGE 25 (2016) Abstr 5844 [www.page-meeting.org/?abstract=5844].