Article

Discriminative motif analysis of high-throughput dataset

Journal

BIOINFORMATICS
Volume 30, Issue 6, Pages 775-783

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btt615

Funding

  1. Interdisciplinary Training Program [T32 CA080416]
  2. Developmental Biology Predoctoral Training Grant [T32HD007183]
  3. National Institutes of Health NIAMS [R01AR045113]

Motivation: High-throughput ChIP-seq studies typically identify thousands of peaks for a single transcription factor (TF). Traditional motif discovery tools often predict motifs that are statistically significant against a naive background distribution but are of questionable biological relevance.

Results: We describe a simple algorithm for discovering differential motifs between two sequence datasets that effectively eliminates systematic biases and scales to large datasets. Tested on 207 ENCODE ChIP-seq datasets, our method identifies correct motifs in 78% of the datasets with known motifs, demonstrating improvement in both accuracy and efficiency compared with DREME, another state-of-the-art discriminative motif discovery tool. On the remaining, more challenging datasets, we identify common technical or biological factors that compromise the motif search results and use advanced features of our tool to control for these factors. We also present case studies demonstrating the ability of our method to detect single-base-pair differences in the DNA specificity of two similar TFs. Lastly, we demonstrate discovery of key TF motifs involved in tissue specification by examining high-throughput DNase accessibility data.
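To make the idea of a differential (discriminative) motif search concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a generic approach: count how many sequences in a foreground set (for example, ChIP-seq peaks) and a background set contain each k-mer, then rank k-mers by a one-sided hypergeometric (Fisher-style) enrichment p-value. All function names and the choice of test are illustrative assumptions.

```python
from collections import Counter
from math import comb


def kmers_present(seq, k):
    """Return the set of distinct k-mers occurring in one sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def count_sequences_with_kmer(seqs, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        counts.update(kmers_present(s, k))
    return counts


def hypergeom_tail(x, n_fg, n_bg, total_hits):
    """P(X >= x): chance that at least x of the total_hits sequences
    containing a k-mer fall in the foreground if labels were random."""
    denom = comb(n_fg + n_bg, total_hits)
    upper = min(n_fg, total_hits)
    numer = sum(
        comb(n_fg, i) * comb(n_bg, total_hits - i)
        for i in range(x, upper + 1)
    )
    return numer / denom


def rank_discriminative_kmers(foreground, background, k=6):
    """Rank k-mers seen in the foreground by enrichment over the background.

    Hypothetical helper for illustration only; returns tuples of
    (kmer, foreground_count, background_count, p_value), smallest p first.
    """
    fg = count_sequences_with_kmer(foreground, k)
    bg = count_sequences_with_kmer(background, k)
    results = []
    for kmer, a in fg.items():
        b = bg.get(kmer, 0)
        p = hypergeom_tail(a, len(foreground), len(background), a + b)
        results.append((kmer, a, b, p))
    return sorted(results, key=lambda r: r[3])
```

Using a matched background set instead of a naive base-composition model is what makes a search of this kind discriminative: a k-mer that is common in both sets receives a large p-value even when it is frequent in the foreground. A practical tool would also need to handle reverse complements, degenerate positions, and multiple-testing correction, all of which this sketch omits.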
