Article

Comparative studies of de novo assembly tools for next-generation sequencing technologies

Journal

BIOINFORMATICS
Volume 27, Issue 15, Pages 2031-2037

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btr319

Funding

  1. Shanghai Leading Academic Discipline [S30501]
  2. Shanghai University of Science and Technology
  3. NIH [P50AR055081, R01AG026564, R01AR050496, RC2DE020756, R01AR057049, R03TW008221]

Abstract

Motivation: Several new de novo assembly tools have been developed recently to assemble short sequencing reads generated by next-generation sequencing platforms. However, the performance of these tools under various conditions has not been fully investigated, and there is currently not enough information to make an informed decision about which tool is most likely to perform best under a specific set of conditions.

Results: We studied and compared the performance of commonly used de novo assembly tools specifically designed for next-generation sequencing data, including SSAKE, VCAKE, Euler-sr, Edena, Velvet, ABySS and SOAPdenovo. Tools were compared using several performance criteria, including N50 length, sequence coverage and assembly accuracy. Various properties of read data, including single-end/paired-end, sequence GC content, depth of coverage and base calling error rates, were investigated for their effects on the performance of different assembly tools. We also compared the computation time and memory usage of these seven tools. Based on the results of our comparison, the relative performance of individual tools is summarized, and tentative guidelines for selecting among the assembly tools under different conditions are provided.
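The abstract names N50 length as one of the comparison criteria but does not define it. As a minimal sketch, assuming the commonly used definition (the largest contig length L such that contigs of length at least L together cover at least half of the total assembly length), N50 can be computed from a list of contig lengths as follows. The function name `n50` and the sample lengths are illustrative only and are not taken from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Assumes the common definition: the length of the contig at which the
    running sum of lengths, taken from longest to shortest, first reaches
    at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    if total <= 0:
        raise ValueError("need at least one contig with positive length")

    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Illustrative contig lengths (hypothetical, not data from the study).
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

A larger N50 indicates that more of the assembly sits in long contigs, which is why it is usually read alongside sequence coverage and accuracy rather than on its own.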
