4.5 Article

Parametric Complexity of Sequence Assembly: Theory and Applications to Next Generation Sequencing

Journal

JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 16, Issue 7, Pages 897-908

Publisher

MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2009.0005

Keywords

genome assembly; sequencing; parametric complexity; Chinese Postman problem

Funding

  1. Henry Jackson Foundation [0000150300]
  2. [IIS-0812111]
  3. Direct For Computer & Info Scie & Enginr
  4. Div Of Information & Intelligent Systems [0812111] Funding Source: National Science Foundation
  5. Direct For Computer & Info Scie & Enginr
  6. Div Of Information & Intelligent Systems [0844494] Funding Source: National Science Foundation

Ask authors/readers for more resources

In recent years, a flurry of new DNA sequencing technologies have altered the landscape of genomics, providing a vast amount of sequence information at a fraction of the costs that were previously feasible. The task of assembling these sequences into a genome has, however, still remained an algorithmic challenge that is in practice answered by heuristic solutions. In order to design better assembly algorithms and exploit the characteristics of sequence data from new technologies, we need an improved understanding of the parametric complexity of the assembly problem. In this article, we provide a first theoretical study in this direction, exploring the connections between repeat complexity, read lengths, overlap lengths and coverage in determining the hard'' instances of the assembly problem. Our work suggests at least two ways in which existing assemblers can be extended in a rigorous fashion, in addition to delineating directions for future theoretical investigations.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.5
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available