Article

From co-expression to co-regulation: how many microarray experiments do we need?

Journal

GENOME BIOLOGY
Volume 5, Issue 7

Publisher

BMC
DOI: 10.1186/gb-2004-5-7-r48

Funding

  1. NATIONAL CANCER INSTITUTE [K25CA106988] Funding Source: NIH RePORTER
  2. NATIONAL HEART, LUNG, AND BLOOD INSTITUTE [R01HL072370, P50HL073996] Funding Source: NIH RePORTER
  3. NATIONAL HUMAN GENOME RESEARCH INSTITUTE [R21HG002849] Funding Source: NIH RePORTER
  4. NATIONAL INSTITUTE OF ALLERGY AND INFECTIOUS DISEASES [P01AI052106, U54AI057141, R21AI052028] Funding Source: NIH RePORTER
  5. NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES [U24DK058813] Funding Source: NIH RePORTER
  6. NATIONAL INSTITUTE OF ENVIRONMENTAL HEALTH SCIENCES [U19ES011387] Funding Source: NIH RePORTER
  7. NATIONAL INSTITUTE ON DRUG ABUSE [P30DA015625] Funding Source: NIH RePORTER
  8. NCI NIH HHS [K25 CA106988, K25 CA106988-01] Funding Source: Medline
  9. NHGRI NIH HHS [1R21HG002849-01, R21 HG002849] Funding Source: Medline
  10. NHLBI NIH HHS [P50 HL073996, 1P50HL073996-01, R01 HL072370, 5R01HL072370-02] Funding Source: Medline
  11. NIAID NIH HHS [1R21AI052028-01, U54 AI057141, 5P01 AI052106-02, 1U54AI057141-01, P01 AI052106] Funding Source: Medline
  12. NIDA NIH HHS [P30 DA015625, 1 P30 DA015625-01] Funding Source: Medline
  13. NIDDK NIH HHS [5U24DK058813-03] Funding Source: Medline
  14. NIEHS NIH HHS [1U19ES011387-02, U19 ES011387] Funding Source: Medline

Background: Cluster analysis is often used to infer regulatory modules or biological function by associating unknown genes with other genes that have similar expression patterns and known regulatory elements or functions. However, clustering results may not have any biological relevance.

Results: We applied various clustering algorithms to microarray datasets of different sizes, and we evaluated the clustering results by determining the fraction of gene pairs from the same clusters that share at least one known common transcription factor. We used both yeast transcription factor databases (SCPD, YPD) and chromatin immunoprecipitation (ChIP) data to evaluate our clustering results. We showed that the ability to identify co-regulated genes from clustering results depends strongly on the number of microarray experiments used in the cluster analysis, and that the accuracy of these associations plateaus at between 50 and 100 experiments on yeast data. Moreover, the model-based clustering algorithm MCLUST consistently outperforms more traditional methods in accurately assigning co-regulated genes to the same clusters on standardized data.

Conclusions: Our results are consistent across independent evaluation criteria, which strengthens our confidence in them. However, when one compares ChIP data to YPD, the false-negative rate is approximately 80% using the recommended p-value of 0.001. In addition, we showed that even with large numbers of experiments, the false-positive rate may exceed the true-positive rate. In particular, even when all experiments are included, the best results produce clusters with only a 28% true-positive rate using known gene-transcription factor interactions.
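
The evaluation criterion described in the abstract (the fraction of same-cluster gene pairs that share at least one known transcription factor) can be illustrated with a minimal Python sketch. This is not the authors' code. The function name, the gene and factor labels, and the toy data are hypothetical. The sketch also assumes that a gene with no annotation shares no factor with any other gene, which may differ from how the paper handles unannotated genes.

```python
from itertools import combinations


def coregulated_pair_fraction(cluster_of, tfs_of):
    """Fraction of same-cluster gene pairs sharing >= 1 known transcription factor.

    cluster_of: dict mapping gene name -> cluster label
    tfs_of:     dict mapping gene name -> set of known transcription factors
    (Both structures are illustrative assumptions, not the paper's data format.)
    """
    # Group genes by their cluster label.
    clusters = {}
    for gene, label in cluster_of.items():
        clusters.setdefault(label, []).append(gene)

    same_cluster_pairs = 0
    shared_tf_pairs = 0
    for genes in clusters.values():
        # Enumerate every unordered pair of genes within one cluster.
        for g1, g2 in combinations(sorted(genes), 2):
            same_cluster_pairs += 1
            # Assumption: genes missing from tfs_of have no known factors.
            if tfs_of.get(g1, set()) & tfs_of.get(g2, set()):
                shared_tf_pairs += 1

    # Return 0.0 when no cluster holds at least two genes.
    if same_cluster_pairs == 0:
        return 0.0
    return shared_tf_pairs / same_cluster_pairs


# Toy data (hypothetical): two clusters, five genes, three factors.
cluster_of = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1}
tfs_of = {
    "g1": {"TF_A"},
    "g2": {"TF_A", "TF_B"},
    "g3": {"TF_C"},
    "g4": {"TF_B"},
    "g5": {"TF_B"},
}
fraction = coregulated_pair_fraction(cluster_of, tfs_of)
print(fraction)
```

In this toy example there are four same-cluster pairs, of which two (g1 with g2, and g4 with g5) share a factor. A score like this could be recomputed for clusterings built from increasing numbers of experiments to look for the plateau the abstract reports.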
