Article

Informed and automated k-mer size selection for genome assembly

Journal

BIOINFORMATICS
Volume 30, Issue 1, Pages 31-37

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btt310

Funding

  1. Genomics Institute of the Huck Institutes of the Life Sciences

Motivation: Genome assembly tools based on the de Bruijn graph framework rely on a parameter k, which represents a trade-off between several competing effects that are difficult to quantify. There is currently a lack of tools that automatically estimate the best k to use and/or quickly generate histograms of k-mer abundances that would allow the user to make an informed decision.

Results: We develop a fast and accurate sampling method that constructs approximate abundance histograms with several orders of magnitude performance improvement over traditional methods. We then present a fast heuristic that uses the generated abundance histograms for putative k values to estimate the best possible value of k. We test the effectiveness of our tool using diverse sequencing datasets and find that its choice of k leads to some of the best assemblies.
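
The idea of building an approximate abundance histogram by sampling can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sampled_kmer_histogram`, the `sample_rate` parameter, and the choice of a BLAKE2 hash are assumptions made for illustration, and reverse-complement (canonical) k-mers are ignored for brevity. The sketch keeps a k-mer only if its hash falls in a fixed fraction of the hash space, counts the kept k-mers exactly, and scales the number of distinct k-mers at each abundance by the sampling rate.

```python
import hashlib
from collections import Counter


def sampled_kmer_histogram(reads, k, sample_rate=4):
    """Approximate k-mer abundance histogram via hash-based subsampling.

    Hypothetical helper for illustration only. Roughly 1 in `sample_rate`
    distinct k-mers is kept; because the decision depends only on the
    k-mer's hash, every occurrence of a kept k-mer is counted.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Deterministic 64-bit hash of the k-mer (assumed hash choice).
            digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
            if int.from_bytes(digest, "big") % sample_rate == 0:
                counts[kmer] += 1

    # Map abundance -> number of distinct sampled k-mers with that abundance,
    # then scale up to estimate the count over all distinct k-mers.
    hist = Counter(counts.values())
    return {abundance: n * sample_rate for abundance, n in sorted(hist.items())}


if __name__ == "__main__":
    reads = ["ACGTACGT"]
    # With sample_rate=1 nothing is skipped, so the histogram is exact.
    print(sampled_kmer_histogram(reads, k=4, sample_rate=1))
```

Under these assumptions, such a histogram could be computed for each candidate k and compared across values, which is the kind of input the heuristic described in the abstract relies on.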
