III-088

MULTIDIMENSIONAL SCALING FOR LONGITUDINAL DATA EMBEDDINGS IN PHARMACOMETRICS

Mohamed Mohamed 1, Lucas Pereira 1

1 PumasAI (Dover, United States)

Introduction/Objectives: Longitudinal data in pharmacometrics typically involve multiple time-varying inputs and outputs for each subject in a population. Each subject can have a different number of observations at different time points, leading to irregular data structures that are difficult to analyze directly. Nonlinear mixed effects (NLME) models are the standard approach for modeling such data, but they can be computationally intensive and may not scale well to large datasets or complex models. Some machine learning (ML) methods can help eliminate uninformative covariates and biomarkers at a relatively low computational cost. However, many ML models require fixed-size (i.e. tabular) data as inputs and outputs. A tabular representation of a more complex data structure is commonly known as an embedding. In this work, we aim to generate dissimilarity-preserving embeddings for the longitudinal data commonly used in pharmacometrics. This has potential applications in covariate and biomarker filtering as well as model evaluation, to be investigated in future work.

Methods: Multidimensional scaling (MDS) [1] is a set of techniques that can be used to create dissimilarity-preserving embeddings for data points. Metric MDS (MMDS) is a variant of MDS that only requires a pairwise distance/dissimilarity matrix between the points/subjects as an input. For temporal variables, dynamic time warping (DTW) [2] can be used to compute pairwise dissimilarities between subjects, based on their time-varying observations. The output of MMDS is a fixed-size embedding for each subject. MMDS formulates the pairwise dissimilarity preservation problem as a non-convex optimization problem, choosing the embeddings that locally minimize the sum of squared differences between the pairwise dissimilarities in the original space and those in the embedding space. We implemented MMDS and applied it to a synthetic dataset, simulated from a 2-compartment population pharmacokinetic model. Each subject had 11 to 15 observations at random time points. Several population sizes were tested. The embeddings were initialized from a standard normal distribution and scaled by the average pairwise dissimilarity and the embedding dimension. The non-convex optimization problem was solved using limited-memory BFGS (L-BFGS). The final MMDS objective value was used to quantify the quality of the embeddings. However, since DTW is not a proper distance metric, a perfect objective of 0 is generally not achievable, even for a high embedding dimension. To better highlight the value of the embeddings, we therefore studied how well neighborhood structures were preserved in the embedding space compared to the original space. We identified the k = 5 nearest neighbors of each subject in both the original and embedding spaces. Then, we computed the percentage of neighbors that were common to the two spaces. The average overlap percentage between neighborhoods gives an intuitive measure of how well the embeddings preserve local structure.
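The DTW and MMDS steps described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work. The following are assumptions: the DTW local cost is the absolute difference between observed values (observation times are ignored), the initial scaling is taken to be the mean dissimilarity divided by the square root of the embedding dimension, SciPy's L-BFGS-B (without bounds) stands in for L-BFGS, and the toy series are random walks rather than simulated pharmacokinetic profiles. All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def dtw(a, b):
    """Classic DTW cost between two 1-D series (absolute-difference local cost)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def dtw_matrix(series):
    """Symmetric matrix of pairwise DTW dissimilarities between subjects."""
    n = len(series)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dtw(series[i], series[j])
    return D


def stress_and_grad(x_flat, D, dim):
    """Sum over pairs i<j of (||x_i - x_j|| - D_ij)^2 and its gradient."""
    n = D.shape[0]
    X = x_flat.reshape(n, dim)
    diff = X[:, None, :] - X[None, :, :]          # diff[i, j] = x_i - x_j
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # embedding-space distances
    resid = dist - D
    stress = (resid[np.triu_indices(n, k=1)] ** 2).sum()
    safe = np.where(dist > 0, dist, 1.0)          # avoid division by zero
    w = resid / safe
    np.fill_diagonal(w, 0.0)
    grad = 2.0 * (w[:, :, None] * diff).sum(axis=1)
    return stress, grad.ravel()


def mmds(D, dim=3, seed=0):
    """Locally minimize the stress from a scaled standard-normal start."""
    n = D.shape[0]
    rng = np.random.default_rng(seed)
    mean_d = D[np.triu_indices(n, k=1)].mean()
    # Assumed scaling; the exact scaling rule is not stated in the text.
    x0 = rng.standard_normal((n, dim)) * mean_d / np.sqrt(dim)
    res = minimize(stress_and_grad, x0.ravel(), args=(D, dim),
                   jac=True, method="L-BFGS-B")
    return res.x.reshape(n, dim), res.fun


# Toy data (made up for illustration): 10 subjects, 11 to 15 observations each.
rng = np.random.default_rng(42)
series = [rng.standard_normal(rng.integers(11, 16)).cumsum() for _ in range(10)]
D = dtw_matrix(series)
X, stress = mmds(D, dim=3)   # X has one fixed-size row per subject
```

Each row of `X` is the embedding of one subject, and `stress` is the final objective value. Because the problem is non-convex, a different seed may lead to a different local minimum.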
Additionally, we calculated the average DTW dissimilarity between each subject and each of its two sets of neighbors, one set per space. The two averages were compared using a scatter plot in which each point x_i corresponds to a subject i, with coordinates x_i[1] and x_i[2] giving the average DTW dissimilarity to the subject's neighbors in each of the two spaces.
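The two neighborhood diagnostics described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the code used in this work: embedding-space neighbors are found with Euclidean distance, ties are broken by subject index, and the function names `knn` and `neighborhood_report` are hypothetical. The order of the two coordinates (original-space neighbors first) is also an assumption.

```python
import math


def knn(dist_row, i, k):
    """Indices of the k subjects nearest to subject i, excluding i itself."""
    others = [j for j in range(len(dist_row)) if j != i]
    return sorted(others, key=lambda j: dist_row[j])[:k]


def neighborhood_report(D, X, k=5):
    """Average k-NN overlap (in percent) and per-subject dissimilarity pairs.

    D: pairwise dissimilarities in the original space (e.g. DTW), n x n.
    X: embeddings, one fixed-size row per subject.
    """
    n = len(D)
    # Euclidean distances in the embedding space (assumed metric).
    E = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    overlaps, pairs = [], []
    for i in range(n):
        nb_orig = knn(D[i], i, k)   # neighbors in the original space
        nb_emb = knn(E[i], i, k)    # neighbors in the embedding space
        overlaps.append(100.0 * len(set(nb_orig) & set(nb_emb)) / k)
        # Both averages use the original dissimilarities, so they are comparable.
        mean_orig = sum(D[i][j] for j in nb_orig) / k
        mean_emb = sum(D[i][j] for j in nb_emb) / k
        pairs.append((mean_orig, mean_emb))
    return sum(overlaps) / n, pairs


# Toy check (made-up numbers): a 1-D embedding that reproduces D exactly.
X_toy = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
D_toy = [[abs(a[0] - b[0]) for b in X_toy] for a in X_toy]
overlap, pairs = neighborhood_report(D_toy, X_toy, k=2)
```

Plotting the first element of each pair against the second gives the kind of scatter plot described above. Points near the diagonal indicate that the embedding-space neighbors are about as close, in the original dissimilarity, as the true nearest neighbors, even when the two neighbor sets differ.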

Results: The results show an average neighborhood overlap of 60% for a population size of 30 subjects and an embedding dimension >= 3. Increasing the embedding dimension beyond 3 had no significant effect on the result. However, increasing the population size resulted in a significant drop in the neighborhood overlap, to 16%. On the other hand, the scatter plot of average DTW dissimilarity for each space showed good agreement across population sizes, even when the overlap percentage dropped. This agreement is also reflected in the correlation between DTW dissimilarities across spaces: 0.816 for a one-dimensional embedding and 0.993 for embeddings with 3 or more dimensions. While the results are not perfect, they indicate that the embeddings can partially preserve local structure. This is promising enough to justify using them for downstream ML tasks such as covariate and biomarker filtering, which we plan to investigate in future work.

Conclusions: We used MMDS and DTW on longitudinal PK data to obtain tabular embeddings. An experiment was performed to assess how well the neighborhood structure is preserved in the embedding space. The results indicate that the proposed procedure can generate useful, though imperfect, dissimilarity-preserving embeddings for longitudinal data.

References:
[1] Borg, I., Groenen, P. Modern Multidimensional Scaling: Theory and Applications. 2nd ed. 2005. Springer. https://doi.org/10.1007/0-387-28981-X.
[2] Berndt, D. J., Clifford, J. Using Dynamic Time Warping to Find Patterns in Time Series. Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining. 1994. pp. 359–370.

Reference: PAGE 34 (2026) Abstr 12198 [www.page-meeting.org/?abstract=12198]

Poster: Methodology – AI/Machine Learning