Article

Identification of causal genes for complex traits

Journal

BIOINFORMATICS
Volume 31, Issue 12, Pages 206-213

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btv240

Keywords

-

Funding

  1. National Science Foundation [0513612, 0731455, 0729049, 0916676, 1065276, 1302448, 1320589]
  2. National Institutes of Health [K25-HL080079, U01-DA024417, P01-HL30568, P01-HL28481, R01-GM083198, R01-MH101782, R01-ES022282, R01 GM053275]
  3. NIH [U54EB020403]
  4. National Institute of Neurological Disorders and Stroke Informatics Center for Neurogenetics and Neurogenomics [P30 NS062691, T32 NS048004-09]
  5. National Science Foundation, Directorate for Computer & Information Science & Engineering [1320589]
  6. National Science Foundation, Division of Information & Intelligent Systems [1320589]


Motivation: Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants have been validated as causal. We define 'causal variants' as the variants responsible for the association signal at a locus. Association studies benefit from linkage disequilibrium (LD), but LD is also the main obstacle to identifying causal variants at associated loci, because it leaves many closely correlated variants that are hard to tell apart. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations.

Results: In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that operates across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that its output is a minimally sized set of genes that captures the genes harboring causal variants with probability ρ. Through extensive simulations, we demonstrate that our method not only speeds up computation but also has, on average, a 10% higher recall rate than existing approaches. We validate our method on a real mouse high-density lipoprotein (HDL) dataset and show that CAVIAR-Gene identifies Apoa2 (a gene known to harbor causal variants for HDL) while reducing the number of genes that need to be tested for functionality by a factor of 2.
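The idea of a minimally sized gene set that reaches a target probability can be illustrated with a small sketch. This is not the CAVIAR-Gene algorithm itself: it assumes, for simplicity, that exactly one gene is causal and that a per-gene posterior probability is already available, so the smallest set is found by ranking genes and accumulating probability until the target `rho` is reached. The function name, the placeholder gene names other than Apoa2, and all posterior values are hypothetical.

```python
def rho_credible_gene_set(posteriors, rho=0.95):
    """Return (genes, covered): the smallest gene set whose summed
    posterior probability reaches rho.

    Simplifying assumption (not from the paper): exactly one causal gene,
    so the per-gene posteriors describe mutually exclusive events.
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posteriors must have a positive sum")

    # Rank genes from most to least probable; taking them in this order
    # yields the smallest set for any coverage target.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)

    selected, covered = [], 0.0
    for gene, p in ranked:
        if covered >= rho:
            break
        selected.append(gene)
        covered += p / total  # normalise in case inputs do not sum to 1
    return selected, covered


# Hypothetical posteriors for four genes in one associated region.
example = {"GeneB": 0.25, "Apoa2": 0.5, "GeneC": 0.125, "GeneD": 0.125}
genes, covered = rho_credible_gene_set(example, rho=0.75)
print(genes, covered)
```

In this toy input, two of the four genes are enough to reach a coverage of 0.75, which mirrors the kind of reduction in follow-up candidates the abstract describes.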
