III-039

Understanding the numerical issues in, and instability of, fits based on the Laplace approximation

Patrick Kofod Mogensen1, Andreas Noack2

1PumasAI, 2PumasAI

Introduction

The application of the Laplace approximation to the marginal log-likelihood function has enabled researchers to analyse many statistically realistic non-linear mixed effects models without repeatedly carrying out expensive numerical integration over the random effects. In pharmacometrics, [1] is a main resource for details about the derivations and the method as implemented in NONMEM, though software also exists for more general modelling and estimation, such as [2]. It is well known in practice, and sparsely mentioned in the literature, that the Laplace approximation can have “numerical issues” or “fail to converge” in some cases. The root cause has not been thoroughly analyzed in either the pharmacometrics literature or the statistical literature more broadly, but is rather taken as a numerical curiosity. This is unfortunate, as the Laplace approximation is applicable to a very broad class of models and is therefore a very useful tool for studying realistic statistical models. Even though FOCE has been extended to statistical models beyond the Gaussian case in [4] and related work, Laplace remains more general and in that sense more powerful.

Objectives

To remedy the issues that cause diverging gradients of the Laplace approximation, it may be possible to regularize the method further, beyond the regularization that is built in through the prior on the random effects. Further regularization will introduce bias, but a sufficiently small bias may be acceptable if it enables the researcher to model important statistical properties such as censoring of the output. In any case, it is important to know exactly what the problem is before we can study any solutions.
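As a point of reference, the approximation under discussion takes the following standard form (the notation here is generic and not lifted from [1] or [2]). With fixed effects $\theta$, a $d$-dimensional random effect $\eta$, and joint log-likelihood $\ell(\theta,\eta) = \log p(y\mid\eta,\theta) + \log p(\eta\mid\theta)$:

```latex
\log L(\theta)
  = \log \int e^{\ell(\theta,\eta)}\, d\eta
  \approx \ell(\theta,\hat\eta)
    + \frac{d}{2}\log 2\pi
    - \frac{1}{2}\log\det H(\theta,\hat\eta),
\qquad
\hat\eta = \arg\max_{\eta} \ell(\theta,\eta),
\quad
H(\theta,\hat\eta) = -\nabla^2_{\eta}\,\ell(\theta,\eta)\Big|_{\eta=\hat\eta}.
```

Here $\hat\eta$ is the empirical Bayes estimate found in the inner optimization, and $H$ is the Hessian of the negative joint log-likelihood with respect to the random effects. The log-determinant term is the one that misbehaves when $H$ approaches singularity.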
This work adds to this by setting up the following objectives:
– To study the issues that are often reported for failed Laplace approximation fits in simulations and relate them to the mathematical properties of the approximation
– To study the mathematical equations of the Laplace approximation for a specific model and sampling scheme to gain insight into the root cause of the failure
– To identify the key properties of the above-mentioned components that cause fits to fail, in order to design more robust algorithms in future work

Methods

In this work, we use simulations and theory to show that the problem is not a property of complicated model structure such as censored or truncated data models. Indeed, we study a simple model with one compartment, a single IV injection, one Gaussian random effect (on clearance), and a proportional error model. Simulations are carried out using the Pumas software [3], but all conclusions and analyses apply generally to software that uses Laplace’s approximation. Using this simple setup, we construct a case where the optimization breaks down in the sense that the objective function keeps decreasing while the gradient elements diverge towards infinity. We use this case to highlight the issues that can be seen more generally in the analytical expressions for the various terms that enter the Laplace approximation to the marginal likelihood.

Results

A “failed fit with diverging gradients” can easily be constructed when simulating the model described above. This means that we do not need a case with censored data, even though that is a setting where the issue often comes up. This simplifies the further analysis.
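To make the setup concrete, the following sketch implements a scalar version of this computation in Python: a one-compartment IV bolus model with a single Gaussian random effect on clearance, a proportional error model, an inner optimization for the empirical Bayes estimate, and a finite-difference Hessian. All parameter values are hypothetical, and this is an illustration of the computation, not the Pumas implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical one-compartment IV bolus model: C(t) = (dose/V) * exp(-CL_i/V * t),
# with individual clearance CL_i = CL * exp(eta), eta ~ N(0, omega2),
# and proportional residual error with variance sigma2 * C(t)^2.
dose, V = 100.0, 10.0                        # assumed dose and volume
t = np.array([0.5, 1.0, 2.0, 4.0, 8.0])     # assumed sampling times

def conc(CL, eta):
    """Model-predicted concentrations at the sampling times."""
    return (dose / V) * np.exp(-(CL * np.exp(eta) / V) * t)

def joint_neg_loglik(eta, y, CL, omega2, sigma2):
    """Negative joint log-likelihood of observations y and random effect eta."""
    c = conc(CL, eta)
    var = sigma2 * c**2                       # proportional error variance
    ll_data = -0.5 * np.sum(np.log(2 * np.pi * var) + (y - c) ** 2 / var)
    ll_eta = -0.5 * (np.log(2 * np.pi * omega2) + eta**2 / omega2)
    return -(ll_data + ll_eta)

def laplace_objective(y, CL, omega2, sigma2, h=1e-5):
    """-2 * Laplace approximation of the marginal log-likelihood (scalar eta)."""
    # Inner optimization: empirical Bayes estimate eta_hat.
    res = minimize_scalar(joint_neg_loglik, args=(y, CL, omega2, sigma2),
                          bounds=(-5.0, 5.0), method="bounded")
    eta_hat = res.x
    # Central finite-difference Hessian of the joint negative log-likelihood.
    f = lambda e: joint_neg_loglik(e, y, CL, omega2, sigma2)
    H = (f(eta_hat + h) - 2.0 * f(eta_hat) + f(eta_hat - h)) / h**2
    # log L ~ -f(eta_hat) + 0.5*log(2*pi) - 0.5*log(H); return -2*log L.
    return 2.0 * f(eta_hat) - np.log(2 * np.pi) + np.log(H)
```

The outer optimization over the population parameters (here CL, omega2, sigma2) would minimize `laplace_objective` across subjects; it is the `np.log(H)` term in the last line that drives the behaviour analysed in this work.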
When considering the failed cases, it is often suggested to restart the optimization from a different starting point to circumvent the issues studied here, but we find that the new, “stable” optima are often characterized by lower log-likelihood values, so they cannot simply be preferred over the candidate that leads to diverging gradients. Instead, we need to understand what goes wrong. The Hessian of the joint log-likelihood with respect to the random effects is at the heart of the problem. In some circumstances, the Hessian becomes (close to) singular as the fixed effects (population parameters) move along a narrow path with large gradients. This creates the problem that is often observed, because the Hessian evaluated at the empirical Bayes estimates takes a central role not only in the inner optimization but also in the Laplace approximation itself, where it enters a log-determinant expression. This links the (near) singular matrix with the diverging gradients. We see that the objective function in the outer optimization problem slowly decreases, but the gradient elements diverge towards infinity and convergence is never achieved.

Conclusion

In this work, we characterize the problems that occur when Laplace approximation based fits run into what is often simply described as “numerical issues”. Rather than putting them in a black box of things we don’t really understand but encounter from time to time, we use specific models and examples to study the nature of the failure. This leads to a better understanding of how to avoid the issues for the model we plan to use in a given study, and hopefully inspires future work to make the method more robust to the issue of a (near) singular Hessian.
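The mechanism can be isolated in a deliberately minimal scalar sketch. The parametrization H(theta) = theta**2 below is hypothetical, chosen only so that the inner-problem Hessian becomes singular as theta approaches zero; it is not derived from the model above.

```python
import numpy as np

# The log-determinant contribution to the -2*log-likelihood objective is
# log(H(theta)) = 2*log(theta): it decreases towards -inf only
# logarithmically, while its gradient, 2/theta, grows without bound.
def logdet_contribution(theta):
    H = theta**2                  # scalar stand-in for det H(theta)
    return np.log(H)              # enters the objective as +log det H

def logdet_gradient(theta):
    return 2.0 / theta            # d/dtheta of log(theta**2)

for theta in [1.0, 1e-2, 1e-4, 1e-6]:
    print(f"theta={theta:.0e}  objective term={logdet_contribution(theta):8.2f}  "
          f"gradient={logdet_gradient(theta):.1e}")
```

Over six orders of magnitude in theta, the objective term drops only from 0 to about -28 while the gradient grows from 2 to 2e6, mirroring the slowly decreasing objective and diverging gradient elements observed in the failed fits.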
This work does not explicitly formulate a universal solution to the problem of diverging gradients, but future work may use it as a point of departure when suggesting regularization schemes or other solutions to the issues that keep many researchers from formulating their statistical models in a way that is faithful to the study protocol and the properties of the sampled data.

[1] Wang, Y. (2007). Derivation of various NONMEM estimation methods. J Pharmacokinet Pharmacodyn, 34, 575–593. https://doi.org/10.1007/s10928-007-9060-6
[2] Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H., & Bell, B. M. (2016). TMB: Automatic Differentiation and Laplace Approximation. Journal of Statistical Software, 70(5), 1–21. https://doi.org/10.18637/jss.v070.i05
[3] Rackauckas, C., Ma, Y., Noack, A., Dixit, V., Mogensen, P. K., Byrne, S., … & Ivaturi, V. (2020). Accelerated predictive healthcare analytics with Pumas, a high performance pharmaceutical modeling and simulation platform. BioRxiv, 2020-11.
[4] Noack, A., Mogensen, P., Nyberg, J., & Ivaturi, V. (2021). Generalized FOCE with Pumas. PAGE 29 Abstr 9734 [www.page-meeting.org/?abstract=9734]

Reference: PAGE 33 (2025) Abstr 11580 [www.page-meeting.org/?abstract=11580]

Poster: Methodology - Estimation Methods
