Article

SNP detection for massively parallel whole-genome resequencing

Journal

GENOME RESEARCH
Volume 19, Issue 6, Pages 1124-1132

Publisher

COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT
DOI: 10.1101/gr.088013.108

Funding

  1. Chinese Academy of Science [GJHZ0701-6, KSCX2-YWN-023]
  2. National Natural Science Foundation of China [30725008, 90403130, 90608010, 30221004, 90612019, 30392130]
  3. Chinese 973 program [2007CB815701, 2007CB815703, 2007CB815705]
  4. Chinese 863 program [2006AA02Z334, 2006AA10A121, 2006AA02Z177]
  5. Chinese Municipal Science and Technology Commission [D07030200740000]
  6. Danish Platform for Integrative Biology
  7. Ole Romer grant from the Danish Natural Science Research Council
  8. Danish Research Council and the Solexa [272-07-0196]
  9. Lundbeck Foundation Centre of Applied Medical Genomics for Personalized Disease Prediction, Prevention and Care (LUCAMP)

Next-generation massively parallel sequencing technologies provide ultrahigh throughput at two orders of magnitude lower unit cost than capillary Sanger sequencing technology. One of the key applications of next-generation sequencing is studying genetic variation between individuals using whole-genome or target region resequencing. Here, we have developed a consensus-calling and SNP-detection method for sequencing-by-synthesis Illumina Genome Analyzer technology. We designed this method by carefully considering the data quality, alignment, and experimental errors common to this technology. All of this information was integrated into a single quality score for each base under Bayesian theory to measure the accuracy of consensus calling. We tested this methodology using a large-scale human resequencing data set of 36× coverage and assembled a high-quality nonrepetitive consensus sequence for 92.25% of the diploid autosomes and 88.07% of the haploid X chromosome. Comparison of the consensus sequence with Illumina human 1M BeadChip genotyped alleles from the same DNA sample showed that 98.6% of the 37,933 genotyped alleles on the X chromosome and 98% of 999,981 genotyped alleles on autosomes were covered at 99.97% and 99.84% consistency, respectively. At a low sequencing depth, we used prior probability of dbSNP alleles and were able to improve coverage of the dbSNP sites significantly as compared to that obtained using a nonimputation model. Our analyses demonstrate that our method has a very low false call rate at any sequencing depth and excellent genome coverage at a high sequencing depth.
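The abstract describes folding base quality and other error sources into one Bayesian quality score per consensus base, but does not spell out the model. As a rough illustration of the general idea only, the sketch below computes diploid genotype posteriors at a single site from observed bases and their Phred qualities, then reports a Phred-scaled consensus quality. Everything in it is an assumption made for illustration: the function names, the independent-error model with errors split evenly over the three other bases, the uniform default prior, and the cap of 99 on the score. It is a textbook-style genotype caller, not the method of the paper, and it ignores alignment and experimental error terms that the paper says it models.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"
# All 10 unordered diploid genotypes: AA, AC, AG, ..., TT.
GENOTYPES = list(combinations_with_replacement(BASES, 2))


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    # Clamp at 0.75 so a Q0 base is treated as uninformative (all bases equally likely).
    return min(10 ** (-q / 10), 0.75)


def base_likelihood(observed, allele, err):
    """P(observed base | true allele), assuming errors are spread evenly over 3 bases."""
    return 1 - err if observed == allele else err / 3


def genotype_posteriors(pileup, prior=None):
    """Posterior over diploid genotypes for one site.

    pileup: list of (base, phred_quality) pairs covering the site.
    prior:  optional dict genotype -> prior probability (hypothetical hook,
            e.g. for up-weighting known variant alleles); uniform if omitted.
    """
    if prior is None:
        prior = {g: 1 / len(GENOTYPES) for g in GENOTYPES}
    log_post = {}
    for g in GENOTYPES:
        total = math.log(prior[g])
        for base, q in pileup:
            err = phred_to_error(q)
            # Each read samples one of the two alleles with probability 1/2.
            p = 0.5 * base_likelihood(base, g[0], err) + 0.5 * base_likelihood(base, g[1], err)
            total += math.log(p)
        log_post[g] = total
    # Normalise in log space to avoid underflow at high depth.
    peak = max(log_post.values())
    unnorm = {g: math.exp(v - peak) for g, v in log_post.items()}
    norm = sum(unnorm.values())
    return {g: v / norm for g, v in unnorm.items()}


def call_consensus(pileup, prior=None):
    """Return (genotype string, Phred-scaled quality capped at 99)."""
    post = genotype_posteriors(pileup, prior)
    best = max(post, key=post.get)
    # Sum the alternatives directly instead of computing 1 - post[best].
    p_wrong = sum(v for g, v in post.items() if g != best)
    qual = 99 if p_wrong <= 0 else min(99, round(-10 * math.log10(p_wrong)))
    return "".join(best), qual


if __name__ == "__main__":
    homozygous = [("A", 30)] * 10
    heterozygous = [("A", 30)] * 5 + [("G", 30)] * 5
    print(call_consensus(homozygous))
    print(call_consensus(heterozygous))
```

The `prior` argument is where site-specific knowledge could enter in a model of this shape: at low depth the likelihood term is weak, so a prior that favours previously reported alleles has more influence on the call. How the paper actually constructs its dbSNP-based prior is not stated in the abstract.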
