Article

Ranking causal variants and associated regions in genome-wide association studies by the support vector machine and random forest

Journal

NUCLEIC ACIDS RESEARCH
Volume 39, Issue 9, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkr064

Keywords

-

Funding

  1. National Science Foundation [EF0331654]
  2. NJIT
  3. Wellcome Trust [076113]
  4. U.S. National Science Foundation
  5. U.S. National Institutes of Health

We study the number of causal variants and associated regions identified by top SNPs in rankings given by the popular 1 df chi-squared statistic, support vector machine (SVM) and the random forest (RF) on simulated and real data. If we apply the SVM and RF to the top 2r chi-square-ranked SNPs, where r is the number of SNPs with P-values within the Bonferroni correction, we find that both improve the ranks of causal variants and associated regions and achieve higher power on simulated data. These improvements, however, as well as stability of the SVM and RF rankings, progressively decrease as the cutoff increases to 5r and 10r. As applications we compare the ranks of previously replicated SNPs in real data, associated regions in type 1 diabetes, as provided by the Type 1 Diabetes Consortium, and disease risk prediction accuracies as given by top ranked SNPs by the three methods. Software and webserver are available at http://svmsnps.njit.edu.
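
The preselection step described in the abstract (rank SNPs by the 1 df chi-squared statistic, count the r SNPs whose P-values pass the Bonferroni correction, then keep the top 2r for re-ranking) can be sketched as follows. This is a minimal illustration, not the authors' software: it assumes the 1 df statistic is the allelic test on the 2x2 table of allele counts, that genotypes are coded as minor-allele counts 0/1/2, and that the family-wise level is 0.05. The function names `allelic_chi2`, `chi2_pvalue_1df` and `candidate_snps` are hypothetical. The SVM or RF would then be trained only on the returned SNPs.

```python
import math


def allelic_chi2(genotypes, labels):
    """1-df chi-squared statistic on the 2x2 allele-count table.

    Assumption: genotypes are minor-allele counts (0, 1, 2) per individual,
    labels are 1 for cases and 0 for controls.
    """
    case_minor = sum(g for g, y in zip(genotypes, labels) if y == 1)
    ctrl_minor = sum(g for g, y in zip(genotypes, labels) if y == 0)
    n_case = sum(1 for y in labels if y == 1)
    n_ctrl = len(labels) - n_case
    # Each individual carries two alleles.
    a, b = case_minor, 2 * n_case - case_minor
    c, d = ctrl_minor, 2 * n_ctrl - ctrl_minor
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # Monomorphic SNP or an empty group: no evidence of association.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def chi2_pvalue_1df(stat):
    """Upper-tail P-value of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def candidate_snps(snp_columns, labels, alpha=0.05, multiplier=2):
    """Return (indices of the top multiplier*r SNPs by chi-squared rank, r).

    r is the number of SNPs whose P-value is at or below alpha / (number of SNPs),
    i.e. the Bonferroni-corrected threshold. alpha = 0.05 is an assumption.
    """
    stats = [allelic_chi2(col, labels) for col in snp_columns]
    ranking = sorted(range(len(stats)), key=lambda j: stats[j], reverse=True)
    threshold = alpha / len(stats)
    r = sum(1 for s in stats if chi2_pvalue_1df(s) <= threshold)
    return ranking[: multiplier * r], r
```

Setting `multiplier` to 5 or 10 reproduces the wider 5r and 10r cutoffs mentioned in the abstract, for which the reported improvements and ranking stability decrease.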
