Article, Proceedings Paper

Efficient identification of DNA hybridization partners in a sequence database

Journal

BIOINFORMATICS
Volume 22, Issue 14, Pages E350-E358

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btl240

Funding

  1. NHGRI NIH HHS [T32 HG00035, R33 HG003070] Funding Source: Medline
  2. NIGMS NIH HHS [R01 GM071923] Funding Source: Medline

Motivation: The specific hybridization of complementary DNA molecules underlies many widely used molecular biology assays, including the polymerase chain reaction and various types of microarray analysis. For such an assay to work well, the primer or probe must bind to its intended target without also binding to additional sequences in the reaction mixture. For any given probe or primer, potential non-specific binding partners can be identified using state-of-the-art models of DNA binding stability. Unfortunately, these models rely on dynamic programming algorithms that are too slow to apply on a genomic scale.

Results: We present an algorithm that efficiently scans a DNA database for short (approximately 20-30 base) sequences that will bind to a query sequence. We use a filtering approach, in which a series of increasingly stringent filters is applied to a set of candidate k-mers. The k-mers that pass all filters are then located in the sequence database using a precomputed index, and an accurate model of DNA binding stability is applied to the sequence surrounding each of the k-mer occurrences. This approach reduces the time to identify all binding partners for a given DNA sequence in human genomic DNA by approximately three orders of magnitude, from two days for the ENCODE regions to less than one minute for typical queries. Our approach is scalable to large DNA sequences. Our method can scan the human genome for medium strength binding sites to a candidate PCR primer in an average of 34.5 minutes.
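The filter-then-verify pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the filters or the stability model, so the GC-count filter, the ungapped match-count score, and all function names and parameters below (`gc_filter`, `toy_binding_score`, `find_binding_sites`, `k`, `min_gc`, `min_score`) are hypothetical stand-ins chosen only to show the overall shape of the approach: candidate k-mers are filtered cheaply, located through a precomputed index, and only the surrounding windows are scored with the more expensive model.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def build_kmer_index(database, k):
    """Precompute a map from every k-mer in the database to its start positions."""
    index = defaultdict(list)
    for i in range(len(database) - k + 1):
        index[database[i:i + k]].append(i)
    return index


def gc_filter(kmer, min_gc):
    """Cheap filter (assumption): keep k-mers with at least min_gc G/C bases."""
    return sum(base in "GC" for base in kmer) >= min_gc


def toy_binding_score(query, site):
    """Stand-in for an accurate stability model (assumption): count positions
    where the site matches the query's reverse complement, with no gaps."""
    target = reverse_complement(query)
    return sum(a == b for a, b in zip(target, site))


def find_binding_sites(query, database, k=4, min_gc=2, min_score=7):
    """Return sorted (start, score) pairs for database windows that pass the
    cheap k-mer filter and then reach min_score under the toy model."""
    index = build_kmer_index(database, k)
    target = reverse_complement(query)
    hits = {}
    for offset in range(len(target) - k + 1):
        kmer = target[offset:offset + k]
        if not gc_filter(kmer, min_gc):
            continue  # rejected before any index lookup or scoring
        for pos in index.get(kmer, []):
            start = pos - offset  # align the full window around the seed
            if start < 0 or start + len(query) > len(database):
                continue
            site = database[start:start + len(query)]
            score = toy_binding_score(query, site)
            if score >= min_score:
                hits[start] = score
    return sorted(hits.items())
```

For example, `find_binding_sites("ACGTGGCA", "TTTTTGCCACGTAAAA")` returns `[(4, 8)]`: the database contains the exact reverse complement of the query starting at position 4, and all eight positions match. In a realistic setting the index would be built once and reused across queries, and the final scoring step would be replaced by a thermodynamic model of duplex stability.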
