Article

A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3

Journal

FLY
Volume 6, Issue 2, Pages 80-92

Publisher

TAYLOR & FRANCIS INC
DOI: 10.4161/fly.19695

Keywords

personal genomes; Drosophila melanogaster; whole-genome SNP analysis; next generation DNA sequencing

Funding

  1. Michigan Core Technology grant from the State of Michigan's 21st Century Fund Program
  2. Environmental Health Sciences Center in Molecular and Cellular Toxicology with Human Applications Grant at Wayne State University [P30 ES06639]
  3. NIH [ES012933, DK071073]

We describe a new computer program, SnpEff, for rapidly categorizing the effects of variants in genome sequences. Once a genome is sequenced, SnpEff annotates variants based on their genomic locations and predicts coding effects. Annotated genomic locations include intronic, untranslated region, upstream, downstream, splice site, or intergenic regions. Coding effects such as synonymous or non-synonymous amino acid replacement, start codon gains or losses, stop codon gains or losses, or frame shifts can be predicted. Here the use of SnpEff is illustrated by annotating ~356,660 candidate SNPs in ~117 Mb of unique sequence, representing a substitution rate of ~1/305 nucleotides, between the Drosophila melanogaster w1118; iso-2; iso-3 strain and the reference y1; cn1 bw1 sp1 strain. We show that ~15,842 SNPs are synonymous and ~4,467 SNPs are non-synonymous (N/S ~0.28). The remaining SNPs are in other categories, such as stop codon gains (38 SNPs), stop codon losses (8 SNPs), and start codon gains (297 SNPs) in the 5'UTR. We found, as expected, that the SNP frequency is proportional to the recombination frequency (i.e., highest in the middle of chromosome arms). We also found that start-gain or stop-lost SNPs in Drosophila melanogaster often result in additions of N-terminal or C-terminal amino acids that are conserved in other Drosophila species. It appears that the 5' and 3'UTRs are reservoirs of genetic variation that changes the termini of proteins during the evolution of the Drosophila genus. As genome sequencing is becoming inexpensive and routine, SnpEff enables rapid analyses of whole-genome sequencing data to be performed by an individual laboratory.
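The coding-effect categories named in the abstract can be illustrated with a minimal Python sketch. This is not SnpEff's implementation: the function name `classify_coding_snp`, its parameters, and the effect labels are hypothetical and chosen only for illustration. It assumes the standard genetic code, a single-base substitution within one codon on the coding strand, and it ignores frame shifts, splice sites, and start gains in the 5'UTR.

```python
# Illustrative sketch only; not SnpEff's code. Names and labels are hypothetical.

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO_ACIDS
    )
}


def classify_coding_snp(ref_codon, pos_in_codon, alt_base, is_first_codon=False):
    """Classify a single-base substitution inside a codon.

    ref_codon      -- reference codon on the coding strand, e.g. "GAA"
    pos_in_codon   -- 0-based position of the changed base (0, 1 or 2)
    alt_base       -- the substituted base
    is_first_codon -- True if this codon is the annotated start codon
    """
    ref_codon = ref_codon.upper()
    alt_base = alt_base.upper()
    if len(ref_codon) != 3 or pos_in_codon not in (0, 1, 2):
        raise ValueError("expected a 3-base codon and a position of 0, 1 or 2")
    if ref_codon[pos_in_codon] == alt_base:
        raise ValueError("alternative base equals the reference base")

    alt_codon = ref_codon[:pos_in_codon] + alt_base + ref_codon[pos_in_codon + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    # Any single-base change to an ATG start codon destroys it.
    if is_first_codon and ref_codon == "ATG":
        return "start_lost"
    if ref_aa != "*" and alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*" and alt_aa != "*":
        return "stop_lost"
    if ref_aa == alt_aa:
        return "synonymous"
    return "non_synonymous"


if __name__ == "__main__":
    print(classify_coding_snp("GAA", 2, "G"))  # GAA (Glu) -> GAG (Glu)
    print(classify_coding_snp("GAA", 1, "T"))  # GAA (Glu) -> GTA (Val)
    print(classify_coding_snp("GAA", 0, "T"))  # GAA (Glu) -> TAA (stop)
    # N/S ratio from the counts reported in the abstract
    print(round(4467 / 15842, 2))
```

Counting the labels returned over all coding SNPs would give the kind of synonymous and non-synonymous totals quoted above; the last line reproduces the reported N/S of about 0.28 from those totals.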
