Trajectory modelling based on early childhood longitudinal growth data
Louise Ryan (1), Craig Anderson (1), Stef Van Buuren (2) and various members of the HBGDki Team
(1) University of Technology Sydney, and Australian Research Council Centre of Excellence in Mathematical and Statistical Frontiers; (2) Netherlands Organization for Applied Scientific Research, Leiden, the Netherlands
Objectives: Understand the pros and cons of various approaches to longitudinal growth modelling. Understand how various measures reflecting different growth patterns can be extracted from these models and used to predict other outcomes of interest.
Overview/Description of presentation: We will describe a unique database gathered through the efforts of the Bill and Melinda Gates Foundation that includes twenty-nine longitudinal studies of child growth and development. The studies range in size from 197 children to over 1 million children, with a median of 764 children per study. The studies also vary considerably in the number of repeated measurements per child, with the smallest being an average of 6 measurements per child and the median being an average of 13 measurements per child. Available growth measurements include child weight as well as length or height, depending on the child’s age.
While one could work directly with these measurements, we chose to model age- and gender-adjusted Z-scores for height (HAZ) and weight (WAZ), computed relative to the WHO child growth standards (see http://www.who.int/childgrowth/standards/en). The advantage of working with HAZ and WAZ rather than raw height and weight values is that the models do not need to account explicitly for gender and age.
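To make the Z-score idea concrete, the following is a minimal sketch of the LMS (Box-Cox) transformation that underlies reference-based Z-scores of this kind. The function name `lms_z_score` and the numeric L, M and S values are illustrative assumptions, not values taken from the WHO tables or from our analysis, and the sketch omits any special handling of extreme values that a full implementation may apply.

```python
import math


def lms_z_score(y, L, M, S):
    """Z-score of a measurement y given LMS reference parameters.

    L: Box-Cox power, M: reference median, S: coefficient of variation,
    all for the child's age and gender (looked up elsewhere).
    """
    if y <= 0 or M <= 0 or S <= 0:
        raise ValueError("y, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case of the Box-Cox transform as L tends to 0
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)


# Hypothetical reference values for one age/gender cell (NOT real WHO values)
L_ref, M_ref, S_ref = 1.0, 75.0, 0.035

# A child measuring 72 cm against a hypothetical median of 75 cm
haz = lms_z_score(72.0, L_ref, M_ref, S_ref)
print(round(haz, 3))
```

Because the reference parameters are indexed by age and gender, the resulting score is already adjusted for both, which is why the downstream models can treat HAZ and WAZ as directly comparable across children.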
After presenting various modelling options, including random effects models based on splines and functional data analysis methods, we will assess how well the various models capture the patterns represented in the studies. We assess model fit through the mean square error computed by comparing observed and predicted values for a hold-out sample. We show that a linear spline model (the so-called broken stick model) and a recently developed functional data analysis approach (Xiao et al., 2014+) do best overall. Application of the broken stick methodology requires specification of the number and location of knots. We present a sensitivity analysis that explores the impact of varying the number and location of knots, finding that, in our setting, the method is relatively robust. Surprisingly, the broken stick model consistently outperforms a penalized spline approach, which has a tendency to over-smooth the data. The broken stick model has some computational advantage relative to the functional data analysis method, with the latter being slower to fit in general and sometimes encountering convergence issues. This particular implementation of functional data analysis is designed to work in sparse data settings where observations are available at only a relatively small number of time points.
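As a rough illustration of the broken stick idea and the hold-out comparison described above, the sketch below evaluates a piecewise-linear curve whose value at each knot is a population mean plus a child-specific random effect, and then computes the mean square error on held-out observations. It shows evaluation only: in practice the knot values would be estimated with a linear mixed model. All names (`hat_basis`, `broken_stick`, `holdout_mse`), knot locations and numbers are hypothetical.

```python
def hat_basis(t, knots):
    """Linear B-spline (hat function) basis evaluated at age t.

    knots must be sorted ascending; t is clamped to the knot range.
    At most two adjacent entries are non-zero and they sum to 1.
    """
    t = min(max(t, knots[0]), knots[-1])
    basis = [0.0] * len(knots)
    for j in range(len(knots) - 1):
        lo, hi = knots[j], knots[j + 1]
        if lo <= t <= hi:
            w = (t - lo) / (hi - lo)
            basis[j] = 1.0 - w
            basis[j + 1] = w
            break
    return basis


def broken_stick(t, knots, pop_values, child_effects):
    """Predicted Z-score at age t: population knot values plus child offsets."""
    b = hat_basis(t, knots)
    return sum(bj * (p + u) for bj, p, u in zip(b, pop_values, child_effects))


def holdout_mse(observed, predicted):
    """Mean square error between held-out observations and predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical knots (months) and coefficients, for illustration only
knots = [0.0, 6.0, 12.0, 24.0]
pop_values = [0.0, -0.5, -1.0, -1.5]      # population mean HAZ at each knot
child_effects = [0.2, 0.1, -0.1, -0.3]    # one child's random effects

# Hypothetical held-out measurements for that child
held_out_ages = [3.0, 18.0]
held_out_haz = [0.1, -1.6]

preds = [broken_stick(a, knots, pop_values, child_effects) for a in held_out_ages]
mse = holdout_mse(held_out_haz, preds)
print(preds, mse)
```

Parameterising the curve by its values at the knots means each child is summarised by a short vector of Z-scores at fixed ages, and a sensitivity analysis can be run by changing the `knots` list and refitting.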
We discuss how child growth trajectories can be represented by derivatives computed from the fitted models. We show that numerical derivatives do just as well as analytical derivatives in capturing changing curve shapes. However, we find that the estimated derivative is quite sensitive to the curve fitting method used. We show how estimated derivatives relate to classical measures of child growth, including growth velocity and conditional Standard Deviation Scores (SDS), both of which have been widely advocated in the child growth literature (Cole, 1995). We illustrate how these measures can be used as predictors for subsequent outcomes of interest, for example, cognition. We conclude by discussing some of the challenges faced when analysing complex longitudinal growth data and outline areas where further research and exploration would be useful.
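The relationship between derivatives and the classical measures can be sketched as follows. The example compares an analytical slope with a central-difference numerical derivative on a piecewise-linear fitted curve, and computes a conditional SDS using the standard conditional gain form, (z2 - r * z1) / sqrt(1 - r^2), where r is the correlation between Z-scores at the two ages. The function names, the fitted knot values and the correlation of 0.8 are assumptions made for illustration, not results from our analysis.

```python
import math


def segment_index(t, knots):
    """Index j of the segment [knots[j], knots[j+1]] containing t."""
    for j in range(len(knots) - 1):
        if knots[j] <= t <= knots[j + 1]:
            return j
    raise ValueError("t lies outside the knot range")


def fitted_curve(t, knots, values):
    """Piecewise-linear fitted Z-score at age t."""
    j = segment_index(t, knots)
    w = (t - knots[j]) / (knots[j + 1] - knots[j])
    return (1.0 - w) * values[j] + w * values[j + 1]


def analytic_velocity(t, knots, values):
    """Exact slope of the segment containing t (Z-score units per month)."""
    j = segment_index(t, knots)
    return (values[j + 1] - values[j]) / (knots[j + 1] - knots[j])


def numeric_velocity(t, knots, values, h=1e-3):
    """Central-difference approximation to the derivative at age t."""
    upper = fitted_curve(t + h, knots, values)
    lower = fitted_curve(t - h, knots, values)
    return (upper - lower) / (2.0 * h)


def conditional_sds(z1, z2, r):
    """Z-score at the later age, conditional on the earlier Z-score.

    r is the correlation between Z-scores at the two ages (assumed known).
    """
    return (z2 - r * z1) / math.sqrt(1.0 - r * r)


# Hypothetical fitted values at hypothetical knots (months)
knots = [0.0, 6.0, 12.0, 24.0]
values = [0.2, -0.4, -1.1, -1.8]

v_exact = analytic_velocity(9.0, knots, values)
v_num = numeric_velocity(9.0, knots, values)
csds = conditional_sds(z1=-0.5, z2=-1.0, r=0.8)
print(v_exact, v_num, csds)
```

Within a segment the two velocity estimates agree to rounding error; they differ only at the knots themselves, where a piecewise-linear curve has a kink and the central difference averages the adjacent slopes.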
Conclusions/Take home message: Child growth data can be well captured using a linear spline model that includes random effects to allow for individual children’s departures from overall population curves. More sophisticated methods based on functional data analysis also do very well, but are computationally more challenging.
Xiao, L., Ruppert, D., Zipunnikov, V., and Crainiceanu, C. (2014+). Fast covariance estimation for high dimensional functional data. Statistics and Computing, to appear.
Cole, T. J. (1995). Conditional reference charts to assess weight gain in British infants. Archives of Disease in Childhood 73: 8-16.