Article

Profiling the genome-wide landscape of tandem repeat expansions

Journal

NUCLEIC ACIDS RESEARCH
Volume 47, Issue 15, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkz501

Funding

  1. National Institutes of Health [DP5OD024577, R01HG010149]
  2. Extreme Science and Engineering Discovery Environment (XSEDE) [csd568, ddp268]
  3. National Science Foundation [ACI-1548562]
  4. NIH [DP5OD024577]

Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington's Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome harbors hundreds of thousands of TRs, analysis of TR expansions has been mainly limited to known pathogenic loci. A major challenge is that expanded repeats are beyond the read length of most next-generation sequencing (NGS) datasets and are not profiled by existing genome-wide tools. We present GangSTR, a novel algorithm for genome-wide genotyping of both short and expanded TRs. GangSTR extracts information from paired-end reads into a unified model to estimate maximum likelihood TR lengths. We validate GangSTR on real and simulated data and show that GangSTR outperforms alternative methods in both accuracy and speed. We apply GangSTR to a deeply sequenced trio to profile the landscape of TR expansions in a healthy family and validate novel expansions using orthogonal technologies. Our analysis reveals that healthy individuals harbor dozens of long TR alleles not captured by current genome-wide methods. GangSTR will likely enable discovery of novel disease-associated variants not currently accessible from NGS.
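
To make the idea of a maximum likelihood TR length concrete, the following is a toy sketch and not GangSTR's actual model, which the abstract does not describe. It assumes a single allele, a single kind of evidence (the mapped insert size of read pairs that span the repeat), and a Gaussian fragment-length distribution. The function names and all parameters are hypothetical. Under these assumptions, an allele with more repeat copies than the reference makes spanning pairs map closer together, so each candidate copy number predicts a shift in the observed insert sizes.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def ml_repeat_count(observed_inserts, ref_count, motif_len,
                    frag_mean, frag_sd, max_count=200):
    """Grid-search the repeat copy number that maximizes the likelihood.

    Toy assumption: a pair spanning an allele with `count` copies maps to
    the reference with an insert size shortened by
    (count - ref_count) * motif_len relative to the true fragment length.
    """
    best_count, best_ll = None, -math.inf
    for count in range(1, max_count + 1):
        shift = (count - ref_count) * motif_len
        # Adding the shift back recovers the implied true fragment length.
        ll = sum(log_normal_pdf(obs + shift, frag_mean, frag_sd)
                 for obs in observed_inserts)
        if ll > best_ll:
            best_count, best_ll = count, ll
    return best_count


# Hypothetical locus: 10 reference copies of a 3 bp motif,
# library fragments distributed around 500 bp with sd 50 bp.
inserts = [410, 400, 420, 415, 405]
estimate = ml_repeat_count(inserts, ref_count=10, motif_len=3,
                           frag_mean=500, frag_sd=50)
print(estimate)
```

A grid search is used here only because the candidate set is small and discrete. A tool that combines several kinds of read evidence and models diploid genotypes would need a richer likelihood than this single-term sketch.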
