A generative and causal pharmacokinetic model for haemophilia A: towards a unified model for all factor VIII concentrates.
A. Janssen1, F.C. Bennis2, M.H. Cnossen3, and R.A.A. Mathôt1 for the OPTI-CLOT study group and SYMPHONY consortium
1 Department of Clinical Pharmacology, Hospital Pharmacy, Amsterdam UMC, University of Amsterdam, The Netherlands. 2 Quantitative Data Analytics Group, Department of Computer Science, VU Amsterdam, Amsterdam, The Netherlands. 3 Department of Pediatric Hematology, Erasmus MC Sophia Children’s Hospital, Erasmus University Medical Center Rotterdam, The Netherlands.
Several population pharmacokinetic (PK) models have been developed for the wide range of recombinant factor VIII (rFVIII) concentrates used to treat patients with haemophilia A. However, some do not include the known causal effect of von Willebrand factor (VWF), a chaperone protein that protects FVIII from degradation and thereby reduces FVIII clearance (CL). VWF levels are rarely collected and are thus frequently unavailable for model development. Instead, confounding variables such as age or blood group are included; both are correlated with VWF but likely have no independent causal effect on FVIII CL. Furthermore, available PK models are often developed for a specific brand of rFVIII concentrate. Including confounders and not correcting for brand-specific differences in PK can reduce model performance on new data. For example, simply applying one of the available population PK models to predict FVIII levels can lead to bias.
In this work, we develop a causal, machine learning (ML)-based population PK model for rFVIII that consists of three components: (1) a generative model, allowing for imputation of missing data and simulation of realistic virtual patients, (2) a predictive model, implementing causal relationships between covariates and PK parameters, and (3) a conversion model, correcting for differences in PK between rFVIII concentrates.
To inform the causal model, articles describing the PK of rFVIII were identified through a literature search. Causal relationships between relevant patient characteristics (e.g. weight, height, age, blood group, and VWF) were described using a directed acyclic graph (DAG). In a DAG, nodes (covariates) are connected via edges, which describe the direction of the causal relationship. All edges in the DAG were supported by findings from previous studies. A generative model was constructed to predict every node in the DAG from its parents using probabilistic regression methods (Gaussian processes and linear mixed effects models). A large, public data set was gathered to build the generative model. Weight, height, and age data from 1635 males were obtained from the NHANES data set [5], and 247 VWF levels along with age and blood group from haemophilia A patients were extracted from ten publications using WebPlotDigitizer [6].
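The idea of generating each node from its parents can be illustrated with ancestral sampling over a small DAG. The sketch below is a toy only: the edges, the `sample_node` helper, and every coefficient are hypothetical placeholders, and simple hand-written samplers stand in for the Gaussian processes and linear mixed effects models used in this work.

```python
import random
from graphlib import TopologicalSorter

# Hypothetical DAG: each node maps to its parents.
# These edges are illustrative assumptions, not the DAG from this work.
PARENTS = {
    "age": [],
    "blood_group": [],
    "height": ["age"],
    "weight": ["age", "height"],
    "vwf": ["age", "blood_group"],
}

def sample_node(node, values, rng):
    """Toy conditional samplers; all coefficients are made up."""
    if node == "age":
        return rng.uniform(18.0, 70.0)                      # years
    if node == "blood_group":
        return "0" if rng.random() < 0.45 else "non-0"
    if node == "height":
        mean = 180.0 - 0.05 * (values["age"] - 18.0)        # cm
        return rng.gauss(mean, 7.0)
    if node == "weight":
        bmi = rng.gauss(25.0 + 0.03 * (values["age"] - 40.0), 3.0)
        return max(bmi, 15.0) * (values["height"] / 100.0) ** 2  # kg
    if node == "vwf":
        mean = 90.0 + 0.8 * values["age"]                   # IU/dL
        if values["blood_group"] == "non-0":
            mean *= 1.25
        return max(rng.gauss(mean, 20.0), 1.0)
    raise KeyError(node)

def sample_patient(rng, observed=None):
    """Fill in every node not given in `observed`, parents before children."""
    values = dict(observed or {})
    for node in TopologicalSorter(PARENTS).static_order():
        if node not in values:
            values[node] = sample_node(node, values, rng)
    return values

rng = random.Random(42)
virtual_patient = sample_patient(rng)                 # simulate a full patient
imputed = sample_patient(rng, {"age": 35.0, "blood_group": "0"})  # fill gaps
print(virtual_patient)
print(imputed)
```

Forward sampling of this kind only propagates information from observed parents to missing children. Updating a parent such as blood group from an observed child requires a posterior computation, which this sketch does not attempt.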
Next, a predictive model was constructed using deep compartment models (DCMs), an ML technique that learns covariate effects directly from data [7]. A special architecture was used, making the model fully interpretable while enabling estimation of random effects for Bayesian forecasting. A two-compartment model was used, and random effects were estimated for the CL and central volume of distribution (V1) parameters. Data from three clinical trials evaluating rFVIII-SingleChain (Afstyla©) were provided by CSL Behring GmbH. The data set included weight, height, and VWF levels from severe haemophilia A patients (n = 120) with a median of 12 FVIII measurements per patient. Prior studies have suggested a longer half-life for rFVIII-SingleChain than for other standard half-life (SHL) rFVIII concentrates [8]. A subset of the patients also received rFVIII (Advate©, n = 27); these data were used to fit the conversion model. Model accuracy was validated on an independent data set (n = 21) of severe haemophilia A patients who received rFVIII and rFVIII-FS (Kogenate©) [9]. Model performance was also compared to four previously published PK models on this data set [10-13]. Accuracy was represented by the root mean squared error (RMSE).
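For readers less familiar with compartmental PK, the sketch below shows how a two-compartment model turns CL, V1, Q, and V2 into a concentration-time curve after a single intravenous bolus, and how RMSE is computed from predictions. It uses the standard closed-form bi-exponential solution and does not reproduce the DCM itself. The parameter values, dose, and "observed" levels are hypothetical and chosen only so the example runs.

```python
import math

def two_compartment_bolus(t, dose, cl, v1, q, v2):
    """Central-compartment concentration at time t after an IV bolus.

    Units are assumed consistent, e.g. dose in IU, volumes in dL,
    clearances in dL/h, time in h, giving concentrations in IU/dL.
    """
    k10, k12, k21 = cl / v1, q / v1, q / v2      # micro rate constants
    s = k10 + k12 + k21
    root = math.sqrt(s * s - 4.0 * k10 * k21)
    alpha, beta = (s + root) / 2.0, (s - root) / 2.0
    a = (alpha - k21) / (alpha - beta)           # fast-phase weight
    b = (k21 - beta) / (alpha - beta)            # slow-phase weight
    return dose / v1 * (a * math.exp(-alpha * t) + b * math.exp(-beta * t))

def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical typical parameters and sampling times (not estimates from this work).
params = dict(dose=2000.0, cl=2.0, v1=30.0, q=1.5, v2=5.0)
times = [0.5, 4.0, 24.0, 48.0]
predicted = [two_compartment_bolus(t, **params) for t in times]
observed = [62.0, 50.0, 14.0, 4.0]               # made-up FVIII levels, IU/dL
print(predicted, rmse(observed, predicted))
```

In a DCM, the constants in `params` would be replaced by outputs of a learned function of the covariates. The compartment equations themselves stay the same.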
Accuracy of the generative model was determined by treating all VWF data in the rFVIII-SingleChain data set as missing and evaluating the mean absolute percentage error (MAPE) of the imputed values. Since blood group data were missing, a Bayesian approach was taken to estimate the posterior distributions of the random effects, blood group, and VWF based on patient age and FVIII levels. This was compared to a standard linear regression-based approach predicting VWF from age and an indicator for blood group 0.
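As a reminder of how the imputation metric is defined, a minimal sketch of MAPE is given below. It also returns the standard deviation of the individual absolute percentage errors, on the assumption that a "mean ± spread" summary refers to that quantity. The VWF values in the example are invented.

```python
import statistics

def absolute_percentage_errors(observed, predicted):
    """Per-sample |observed - predicted| / |observed|, in percent."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have equal length")
    if any(o == 0 for o in observed):
        raise ValueError("observed values must be non-zero")
    return [100.0 * abs(o - p) / abs(o) for o, p in zip(observed, predicted)]

def mape(observed, predicted):
    """Return (mean, sample standard deviation) of the percentage errors."""
    ape = absolute_percentage_errors(observed, predicted)
    spread = statistics.stdev(ape) if len(ape) > 1 else 0.0
    return statistics.mean(ape), spread

# Made-up measured and imputed VWF levels (IU/dL), for illustration only.
measured = [95.0, 140.0, 120.0, 80.0]
imputed = [110.0, 125.0, 118.0, 100.0]
print(mape(measured, imputed))
```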
Based on the DAG, patient weight, height, and VWF were chosen to predict CL, while weight and height were used to predict V1. Global parameters were estimated for intercompartmental clearance (Q) and peripheral volume (V2). A DCM was fit to predict rFVIII-SingleChain levels. The final RMSE of typical predictions was 11.2 IU/dL. The learned functions could be visualized and matched expectations about the causal effects of the covariates. The coefficients of variation of the random effects on CL and V1 were estimated to be 23% and 18%, respectively. The estimated standard deviation of the additive residual error was 3.6 IU/dL.
Next, the conversion model was created to adjust typical rFVIII-SingleChain PK parameters to rFVIII PK parameters based on the 27 patients who also received rFVIII. Compared with rFVIII-SingleChain, the estimated CL of rFVIII was 12% higher, V1 was 25% lower, Q was halved, and V2 was almost doubled. After conversion, validation accuracy on the independent data set (RMSE = 11.84 IU/dL) was very similar to accuracy on the training set. Typical predictions from our model outperformed those from the four previously published PK models (mean RMSE 14.4 IU/dL), with the most accurate alternative achieving an RMSE of 12.7 IU/dL.
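One simple way to picture such a conversion is as a set of multiplicative factors applied to the typical parameters. The sketch below assumes that form, which the abstract does not confirm. The CL, V1, and Q factors follow the percentages reported above, while the V2 factor of 2.0 is a placeholder for "almost doubled", since no exact value is given. The input parameters are hypothetical.

```python
# Multiplicative conversion factors from rFVIII-SingleChain to rFVIII.
# CL, V1 and Q follow the reported changes; V2 = 2.0 is a placeholder
# because the text only states that V2 was "almost doubled".
CONVERSION_FACTORS = {"CL": 1.12, "V1": 0.75, "Q": 0.5, "V2": 2.0}

def convert_parameters(typical, factors=CONVERSION_FACTORS):
    """Scale each typical PK parameter by its concentrate-specific factor."""
    missing = set(factors) - set(typical)
    if missing:
        raise KeyError(f"missing PK parameters: {sorted(missing)}")
    return {name: value * factors.get(name, 1.0) for name, value in typical.items()}

# Hypothetical typical rFVIII-SingleChain parameters (dL/h and dL).
single_chain = {"CL": 2.0, "V1": 30.0, "Q": 1.5, "V2": 5.0}
rfviii = convert_parameters(single_chain)
print(rfviii)
```

Keeping the factors separate from the covariate model means the shared covariate effects are learned once, and only a handful of concentrate-specific numbers need to be estimated per product.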
When using regression to impute VWF levels, MAPE of predictions was 28.8% ± 22 when assuming all individuals had blood group non-0 and 28.9% ± 17 when assuming blood group 0. The median of the predicted VWF posterior distributions from the generative model had a MAPE of 17.7% ± 14, while also providing likelihoods for the patient having blood group 0.
The proposed causal model outperformed four previously published rFVIII models in terms of accuracy on the validation data set. The higher accuracy might be related to the use of causal covariates and to the flexibility of an ML-based approach in learning more complex relationships between covariates and PK parameters. The proposed model is fully interpretable, which can increase trust in the model. The model can be used for Bayesian forecasting, even when covariate data are missing, by using the generative model. The generative model further allows for the simulation of realistic virtual patients, while also enabling training of the model on data sets that do not necessarily contain all covariates used in the model. Following from the causal DAG, posterior distributions for all covariates can even be produced based solely on patient age.
Future work includes setting up a federated learning environment to allow researchers worldwide to train the model while keeping their patient data private. Because training of DCMs is fully automated, this procedure would be greatly simplified. By training the model on many different data sets, a truly unified PK model can be built, learning shared causal effects underlying the PK of rFVIII while also learning to correct for differences between concentrates.
[1] Goedhart, T. M., Bukkems, L. H., Zwaan, C. M., Mathôt, R. A., Cnossen, M. H., & OPTI-CLOT study group and SYMPHONY consortium. (2021). Population pharmacokinetic modeling of factor concentrates in hemophilia: an overview and evaluation of best practice. Blood Advances, 5(20), 4314-4325.
[2] Peyvandi, F., Garagiola, I., & Baronciani, L. (2011). Role of von Willebrand factor in the haemostasis. Blood Transfusion, 9(Suppl 2), s3.
[3] Franchini, M., Capra, F., Targher, G., Montagnana, M., & Lippi, G. (2007). Relationship between ABO blood group and von Willebrand factor levels: from biology to clinical implications. Thrombosis Journal, 5, 1-5.
[4] Uster, D. W., Chowdary, P., Riddell, A., Garcia, C., Aradom, E., Musarara, M., & Wicha, S. G. (2022). Dosing for Personalized Prophylaxis in Hemophilia A highly varies on the underlying Population Pharmacokinetic Models. Therapeutic Drug Monitoring, 44(5), 665-673.
[5] Centers for Disease Control and Prevention (CDC). National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2013-2014, https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/overview.aspx?BeginYear=2013 accessed on 11th of November 2022.
[6] Rohatgi, A. (2022). WebPlotDigitizer: Version 4.6.
[7] Janssen, A., Leebeek, F. W., Cnossen, M. H., Mathôt, R. A., OPTI‐CLOT study group and SYMPHONY consortium, Fijnvandraat, K., ... & Keeling, D. (2022). Deep compartment models: A deep learning approach for the reliable prediction of time‐series data in pharmacokinetic modeling. CPT: Pharmacometrics & Systems Pharmacology, 11(7), 934-945.
[8] Mahlangu, J., Young, G., Hermans, C., Blanchette, V., Berntorp, E., & Santagostino, E. (2018). Defining extended half‐life rFVIII—a critical review of the evidence. Haemophilia, 24(3), 348-358.
[9] van Moort, I., Preijers, T., Bukkems, L. H., Hazendonk, H. C., van der Bom, J. G., Laros-van Gorkom, B. A., ... & Keeling, D. (2021). Perioperative pharmacokinetic-guided factor VIII concentrate dosing in haemophilia (OPTI-CLOT trial): an open-label, multicentre, randomised, controlled trial. The Lancet Haematology, 8(7), e492-e502.
[10] Björkman, S., Oh, M., Spotts, G., Schroth, P., Fritsch, S., Ewenstein, B. M., ... & Collins, P. W. (2012). Population pharmacokinetics of recombinant factor VIII: the relationships of pharmacokinetics to age and body weight. Blood, The Journal of the American Society of Hematology, 119(2), 612-618.
[11] Nestorov, I., Neelakantan, S., Ludden, T. M., Li, S., Jiang, H., & Rogge, M. (2015). Population pharmacokinetics of recombinant factor VIII Fc fusion protein. Clinical Pharmacology in Drug Development, 4(3), 163-174.
[12] McEneny-King, A., Chelle, P., Foster, G., Keepanasseril, A., Iorio, A., & Edginton, A. N. (2019). Development and evaluation of a generic population pharmacokinetic model for standard half-life factor VIII for use in dose individualization. Journal of Pharmacokinetics and Pharmacodynamics, 46, 411-426.
[13] Allard, Q., Djerada, Z., Pouplard, C., Repessé, Y., Desprez, D., Galinat, H., ... & Cazaubon, Y. (2020). Real life population pharmacokinetics modelling of eight factors VIII in patients with severe haemophilia A: is it always relevant to switch to an extended half-life?. Pharmaceutics, 12(4), 380.