Article

Genotype-driven identification of a molecular network predictive of advanced coronary calcium in ClinSeq® and Framingham Heart Study cohorts

Journal

BMC SYSTEMS BIOLOGY
Volume 11, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/s12918-017-0474-5

Keywords

Coronary artery calcium; Random forest; Neural networks; Case-control study; Coronary heart disease; Genotype data; Systems biology

Funding

  1. National Human Genome Research Institute of the National Institutes of Health [HG200393]
  2. National Heart, Lung, and Blood Institute (NHLBI)
  3. Boston University [N01-HC-25195]
  4. NHLBI [N02-HL-64278]
  5. NIH [HG200359 08, HG200387 03]


Background: One goal of personalized medicine is leveraging the emerging tools of data science to guide medical decision-making. Achieving this using disparate data sources is most daunting for polygenic traits. To this end, we employed random forests (RFs) and neural networks (NNs) for predictive modeling of coronary artery calcium (CAC), which is an intermediate endophenotype of coronary artery disease (CAD).

Methods: Model inputs were derived from advanced cases in the ClinSeq® discovery cohort (n=16) and the FHS replication cohort (n=36) from the 89th-99th CAC score percentile range, and from age-matched controls (ClinSeq® n=16, FHS n=36) with no detectable CAC (all subjects were Caucasian males). These inputs included clinical variables and genotypes of 56 single nucleotide polymorphisms (SNPs) ranked highest in terms of their nominal correlation with the advanced CAC state in the discovery cohort. Predictive performance was assessed by computing the areas under receiver operating characteristic curves (ROC-AUC).

Results: RF models trained and tested with clinical variables generated ROC-AUC values of 0.69 and 0.61 in the discovery and replication cohorts, respectively. In contrast, in both cohorts, the set of SNPs derived from the discovery cohort was highly predictive (ROC-AUC ≥ 0.85), with no significant change in predictive performance upon integration of clinical and genotype variables. Using the 21 SNPs that produced optimal predictive performance in both cohorts, we developed NN models trained with ClinSeq® data and tested with FHS data, and obtained high predictive accuracy (ROC-AUC=0.80-0.85) with several topologies. Several CAD and vascular aging related biological processes were enriched in the network of genes constructed from the predictive SNPs.

Conclusions: We identified a molecular network predictive of advanced coronary calcium using genotype data from the ClinSeq® and FHS cohorts. Our results illustrate that machine learning tools, which utilize complex interactions between disease predictors intrinsic to the pathogenesis of polygenic disorders, hold promise for deriving predictive disease models and networks.
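
The abstract reports predictive performance as ROC-AUC. As a minimal sketch of what that metric measures, and not the authors' implementation, the snippet below computes ROC-AUC as the fraction of case-control pairs in which the case receives the higher model score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison of cases (label 1) and controls (label 0).

    Equals the probability that a randomly chosen case is scored above a
    randomly chosen control; tied scores contribute 0.5.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical toy data: 4 cases and 4 controls with made-up model scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.3f}")
```

A value of 0.5 corresponds to chance-level ranking of cases against controls and 1.0 to perfect separation, which is the scale on which the reported values (for example 0.61 versus ≥ 0.85) should be read.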
