Journal
JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 16, Issue 7, Pages 897-908Publisher
MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2009.0005
Keywords
genome assembly; sequencing; parametric complexity; Chinese Postman problem
Categories
Funding
- Henry Jackson Foundation [0000150300]
- [IIS-0812111]
- Direct For Computer & Info Scie & Enginr
- Div Of Information & Intelligent Systems [0812111] Funding Source: National Science Foundation
- Direct For Computer & Info Scie & Enginr
- Div Of Information & Intelligent Systems [0844494] Funding Source: National Science Foundation
Ask authors/readers for more resources
In recent years, a flurry of new DNA sequencing technologies have altered the landscape of genomics, providing a vast amount of sequence information at a fraction of the costs that were previously feasible. The task of assembling these sequences into a genome has, however, still remained an algorithmic challenge that is in practice answered by heuristic solutions. In order to design better assembly algorithms and exploit the characteristics of sequence data from new technologies, we need an improved understanding of the parametric complexity of the assembly problem. In this article, we provide a first theoretical study in this direction, exploring the connections between repeat complexity, read lengths, overlap lengths and coverage in determining the hard'' instances of the assembly problem. Our work suggests at least two ways in which existing assemblers can be extended in a rigorous fashion, in addition to delineating directions for future theoretical investigations.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available