Article

M are better than one: an ensemble-based motif finder and its application to regulatory element prediction

Journal

BIOINFORMATICS
Volume 25, Issue 7, Pages 868-874

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btp090

Funding

  1. Fred Hutchinson Cancer Research Center, Seattle, WA
  2. NIH Center of Excellence at Princeton University [P50 GM071508, HHSN266200500021C, GM076275]
  3. National Science Foundation [DGE-9972930, IIS-061223]
  4. National Institute of General Medical Sciences [P50GM071508, R01GM076275] (funding source: NIH RePORTER)

Motivation: Identifying regulatory elements in genomic sequences is a key component in understanding the control of gene expression. Computationally, this problem is often addressed by motif discovery, where the goal is to find a set of mutually similar subsequences within a collection of input sequences. Though motif discovery is widely studied and many approaches to it have been suggested, it remains a challenging and as yet unresolved problem.

Results: We introduce SAMF (Solution-Aggregating Motif Finder), a novel approach for motif discovery. SAMF is based on a Markov Random Field formulation, and its key idea is to uncover and aggregate multiple statistically significant solutions to the given motif finding problem. In contrast to many earlier methods, SAMF does not require prior estimates of the number of motif instances present in the data, is not limited by motif length, and allows motifs to overlap. Though SAMF is broadly applicable, these features make it particularly well suited for addressing the challenges of prokaryotic regulatory element detection. We test SAMF's ability to find transcription factor binding sites in an Escherichia coli dataset and show that it outperforms previous methods. Additionally, we uncover a number of previously unidentified binding sites in this data, and provide evidence that they correspond to actual regulatory elements.
