Machine learning integration of multimodal data identifies key features of blood pressure regulation

Journal

EBIOMEDICINE
Volume 84

Publisher

ELSEVIER
DOI: 10.1016/j.ebiom.2022.104243

Keywords

Blood pressure; Machine learning; Genomics; Metabolomics; Diet

Funding

  1. Wellcome Trust
  2. Medical Research Council (MRC)/British Heart Foundation (BHF) Ancestry and Biological Informative Markers for Stratification of Hypertension (AIM-HY)
  3. European Union
  4. Chronic Disease Research Foundation (CDRF)
  5. Zoe Global Ltd. [212904/Z/18/Z]
  6. NIHR Clinical Research Facility and Biomedical Research Centre [MR/M016560/1]
  7. Qatar Foundation [733100]
  8. MRC AIM-HY
  9. National Institute for Health Research (NIHR)
  10. Bio Resource
  11. Clinical Research Facility and Biomedical Research Centre based at Guy's and St Thomas' NHS Foundation Trust
  12. King's College London
  13. Medical Research Council
  14. British Heart Foundation
  15. Chief Scientist Office, Scotland
  16. Health Data Research UK
  17. Biomedical Research Program at Weill Cornell Medicine in Qatar
  18. Qatar Foundation [PG/12/85/29925, CS/16/1/31878, RE/18/6/34217]
  19. Qatar National Research Fund (QNRF)
  20. [HDR-5012]
  21. [NPRP11C-0115-180010]

By integrating biochemical and dietary data, this study identifies the multifactorial contributors to blood pressure. Machine learning algorithms are used to identify important features and highlight the incremental value of each dimension. The findings are validated in an independent dataset, showing overlapping features between cohorts.
Background: Association studies have identified several biomarkers for blood pressure and hypertension, but a thorough understanding of their mutual dependencies is lacking. By integrating two different high-throughput datasets with biochemical and dietary data, we aim to understand the multifactorial contributors to blood pressure (BP).

Methods: We included 4,863 participants from TwinsUK with concurrent BP, metabolomics, genomics, biochemical measures, and dietary data. We used 5-fold cross-validation with the machine learning XGBoost algorithm to identify features of importance in the context of one another in TwinsUK (80% training, 20% test). The features tested in TwinsUK were then probed using the same algorithm in an independent dataset of 2,807 individuals from the Qatari Biobank (QBB).

Findings: Our model explained 39.2% [4.5%; MAE: 11.32 mmHg (95% CI, ±0.65)] of the variance in systolic BP (SBP) in TwinsUK. Of the top 50 features, the most influential non-demographic variables were dihomo-linolenate, cis-4-decenoyl carnitine, lactate, chloride, urate, and creatinine, along with dietary intakes of total, trans, and saturated fat. We also highlight the incremental value of each included dimension. Furthermore, we replicated our model in the QBB cohort [SBP variance explained = 45.2% (13.39%)], and 30 of the top 50 features overlapped between cohorts.

Interpretation: We show that an integrated analysis of omics, biochemical, and dietary data improves our understanding of their interrelationships and expands the range of potential biomarkers for blood pressure. Our results point to potentially key biological pathways to be prioritised for mechanistic studies.

Copyright (C) 2022 The Authors. Published by Elsevier B.V.
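
The evaluation described in the Methods (5-fold cross-validation, with results reported as variance explained and mean absolute error) can be illustrated with a minimal sketch. This is not the authors' pipeline: a one-feature least-squares line stands in for XGBoost so the example needs only the Python standard library, the data are synthetic, and every function name (`k_fold_indices`, `fit_line`, `r2_and_mae`, `cross_validate`) is hypothetical.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the real model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def r2_and_mae(y_true, y_pred):
    """Return (variance explained, mean absolute error) for one held-out fold."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    mae = statistics.fmean([abs(t - p) for t, p in zip(y_true, y_pred)])
    return 1 - ss_res / ss_tot, mae


def cross_validate(xs, ys, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, and repeat for each fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in held_out]
        scores.append(r2_and_mae([ys[i] for i in held_out], preds))
    return scores


# Synthetic data only: an age-like predictor and a noisy SBP-like outcome.
rng = random.Random(42)
ages = [rng.uniform(20, 80) for _ in range(500)]
sbp = [100 + 0.5 * a + rng.gauss(0, 10) for a in ages]

scores = cross_validate(ages, sbp, k=5)
mean_r2 = statistics.fmean([r2 for r2, _ in scores])
mean_mae = statistics.fmean([mae for _, mae in scores])
print(f"variance explained: {mean_r2:.1%}, MAE: {mean_mae:.2f} mmHg")
```

Scoring only on held-out folds is what makes the variance-explained figure an estimate of out-of-sample performance rather than of fit to the training data. Swapping in a gradient-boosted model would change `fit_line` and the prediction step, while the fold logic and metrics would stay the same.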
