Journal
FEMS MICROBIOLOGY LETTERS
Volume 291, Issue 1, Pages 103-111
Publisher
OXFORD UNIV PRESS
DOI: 10.1111/j.1574-6968.2008.01441.x
Keywords
genome sequencing; de novo sequence assembly; Pseudomonas syringae; Bioinformatics; Illumina; Solexa
Funding
- Gatsby Charitable Foundation
- UK Biotechnology & Biological Sciences Research Council
- DFG
Illumina's Genome Analyzer generates ultra-short sequence reads, typically 36 nucleotides in length, and is primarily intended for resequencing. We tested the potential of this technology for de novo sequence assembly on the 6 Mbp genome of Pseudomonas syringae pv. syringae B728a with several freely available assembly software packages. Using an unpaired data set, velvet assembled >96% of the genome into contigs with an N50 length of 8289 nucleotides and an error rate of 0.33%. edena generated smaller contigs (N50 of 4192 nucleotides) with comparable error rates. ssake and vcake yielded shorter contigs with very high error rates. Assembly of paired-end sequence data carrying 400 bp inserts produced longer contigs (N50 up to 15 628 nucleotides), but with increased error rates (0.5%). Contig length and error rate were very sensitive to the choice of parameter values. Noncoding RNA genes were poorly resolved in de novo assemblies, while >90% of the protein-coding genes were assembled with 100% accuracy over their full length. This study demonstrates that, in practice, de novo assembly of 36-nucleotide reads can generate reasonably accurate assemblies from sequence data sets about 40× deep. These draft assemblies are useful for exploring an organism's proteomic potential at very low cost.
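The abstract compares assemblers by N50 and describes the data as about 40× deep. As a rough illustration of both quantities, the sketch below computes N50 under the common definition (the length of the contig at which the running total of contig lengths, sorted longest first, reaches half of the total assembly length) and estimates how many 36-nucleotide reads about 40× coverage of a 6 Mbp genome implies. The function names and the toy contig lengths are hypothetical and are not taken from the paper, which may have used a slightly different N50 convention.

```python
def n50(contig_lengths):
    """Return N50: the length of the contig at which the cumulative sum,
    taken over contigs sorted from longest to shortest, first reaches
    half of the total assembly length (a common definition, assumed here)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def reads_for_coverage(genome_size, read_length, depth):
    """Approximate number of reads needed so that
    reads * read_length / genome_size >= depth (rounded up)."""
    total_bases = genome_size * depth
    return -(-total_bases // read_length)  # ceiling division on integers


if __name__ == "__main__":
    # Toy contig lengths (hypothetical, for illustration only).
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("N50 of toy assembly:", n50(toy_contigs))

    # Figures from the abstract: ~6 Mbp genome, 36-nt reads, ~40x depth.
    print("Reads for ~40x:", reads_for_coverage(6_000_000, 36, 40))
```

With the figures quoted in the abstract, roughly 6.7 million unpaired 36-nucleotide reads correspond to about 40× coverage; this estimate ignores reads discarded by quality filtering.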