Article, Proceedings Paper

A graph-based approach to systematically reconstruct human transcriptional regulatory modules

Journal

BIOINFORMATICS
Volume 23, Issue 13, Pages I577-I586

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btm227

Funding

  1. NCI NIH HHS [U54CA112952] Funding Source: Medline
  2. NHGRI NIH HHS [P50HG002790] Funding Source: Medline
  3. NIGMS NIH HHS [R01GM074163] Funding Source: Medline

Motivation: A major challenge in studying gene regulation is to systematically reconstruct transcriptional regulatory modules, defined as sets of genes that are regulated by a common set of transcription factors. A commonly used approach to module reconstruction is to derive coexpression clusters from a microarray dataset. However, such results often contain false positives, because genes from many transcription modules may be perturbed simultaneously under a given type of condition. In this study, we propose and validate the hypothesis that genes forming a coexpression cluster in multiple microarray datasets across diverse conditions are more likely to form a transcription module. Identifying genes that are coexpressed in a subset of many microarray datasets, however, is not a trivial computational problem.

Results: We propose a graph-based data-mining approach to efficiently and systematically identify frequent coexpression clusters. Given m microarray datasets, we model each dataset as a coexpression graph and search for vertex sets that are frequently densely connected across ⌈θm⌉ datasets (0 ≤ θ ≤ 1). For this novel graph-mining problem, we designed two techniques to narrow the search space: (1) partition the input graphs into (overlapping) groups sharing common properties; and (2) summarize the vertex neighbor information from the partitioned datasets onto 'Neighbor Association Summary Graphs' for effective mining. We applied our method to 105 human microarray datasets and identified a large number of potential transcription modules, activated under different subsets of conditions. Validation with ChIP-chip data demonstrated that the likelihood of a coexpression cluster being a transcription module increases significantly with its recurrence. Our method opens a new way to exploit the vast amount of accumulated microarray data for the study of gene regulation. Furthermore, the algorithm is applicable to other biological networks for approximate network module mining.
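
To make the notion of a vertex set that is "frequently densely connected" concrete, the following minimal Python sketch checks whether a candidate gene set is dense in at least ⌈θm⌉ of m coexpression graphs. It is a brute-force illustration of the definition only, not the mining algorithm described in the abstract (it does no graph partitioning and builds no summary graphs). The function names, the edge-set representation, the `min_density` threshold, and the toy gene labels are all assumptions made for this example.

```python
from itertools import combinations
from math import ceil


def make_graph(pairs):
    """Represent an undirected coexpression graph as a set of frozenset edges."""
    return {frozenset(p) for p in pairs}


def density(vertices, edges):
    """Fraction of all possible vertex pairs that are connected in `edges`."""
    pairs = list(combinations(sorted(vertices), 2))
    if not pairs:
        return 0.0
    present = sum(1 for p in pairs if frozenset(p) in edges)
    return present / len(pairs)


def support(vertices, graphs, min_density):
    """Number of graphs in which `vertices` reaches the density threshold."""
    return sum(1 for g in graphs if density(vertices, g) >= min_density)


def is_frequent_dense(vertices, graphs, theta, min_density):
    """True if `vertices` is dense in at least ceil(theta * m) of the m graphs."""
    return support(vertices, graphs, min_density) >= ceil(theta * len(graphs))


# Toy input: four hypothetical datasets, each already reduced to a graph
# whose edges join gene pairs considered coexpressed in that dataset.
graphs = [
    make_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]),
    make_graph([("A", "B"), ("A", "C"), ("B", "C")]),
    make_graph([("A", "B"), ("C", "D")]),
    make_graph([("A", "C"), ("B", "C"), ("A", "B"), ("B", "D")]),
]

candidate = {"A", "B", "C"}
print(support(candidate, graphs, min_density=0.9))
print(is_frequent_dense(candidate, graphs, theta=0.75, min_density=0.9))
```

Here the set {A, B, C} is fully connected in three of the four toy graphs, so it meets the recurrence requirement for θ = 0.75. Enumerating every candidate set this way is exponential in the number of genes, which is why techniques for narrowing the search space are needed at the scale of 105 datasets.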
