Article

A practical guide to methods controlling false discoveries in computational biology

Journal

GENOME BIOLOGY
Volume 20

Publisher

BMC
DOI: 10.1186/s13059-019-1716-1

Keywords

Multiple hypothesis testing; False discovery rate; RNA-seq; ScRNA-seq; ChIP-seq; Microbiome; GWAS; Gene set analysis

Funding

  1. US National Institutes of Health [U41HG004059, R02HG005220, R01GM083084, R01GM103552, R00HG009007]
  2. Chan Zuckerberg Initiative DAF grants [2018-183142, 2018-183201, 2018-183560]
  3. Broadnext10 Award
  4. Department of Defense through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program
  5. Siebel Scholars Foundation
  6. Moffitt Cancer Center [NCI P30CA076292]
  7. ENIGMA - Ecosystems and Networks Integrated with Genes and Molecular Assemblies
  8. Scientific Focus Area Program at Lawrence Berkeley National Laboratory - US Department of Energy, Office of Science, Office of Biological & Environmental Research [DE-AC02-05CH11231]

Background: In high-throughput studies, hundreds to millions of hypotheses are typically tested. Statistical methods that control the false discovery rate (FDR) have emerged as popular and powerful tools for error rate control. While classic FDR methods use only p values as input, more modern FDR methods have been shown to increase power by incorporating complementary information as informative covariates to prioritize, weight, and group hypotheses. However, there is currently no consensus on how the modern methods compare to one another. We investigate the accuracy, applicability, and ease of use of two classic and six modern FDR-controlling methods by performing a systematic benchmark comparison using simulation studies as well as six case studies in computational biology.

Results: Methods that incorporate informative covariates are modestly more powerful than classic approaches, and do not underperform classic approaches, even when the covariate is completely uninformative. The majority of methods are successful at controlling the FDR, with the exception of two modern methods under certain settings. Furthermore, we find that the improvement of the modern FDR methods over the classic methods increases with the informativeness of the covariate, total number of hypothesis tests, and proportion of truly non-null hypotheses.

Conclusions: Modern FDR methods that use an informative covariate provide advantages over classic FDR-controlling procedures, with the relative gain dependent on the application and informativeness of available covariates. We present our findings as a practical guide and provide recommendations to aid researchers in their choice of methods to correct for false discoveries.
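To make the distinction between p-value-only and covariate-aware procedures concrete, the following is a minimal sketch in Python. It is an illustration only, not code from the paper, and the abstract does not name the eight methods it benchmarks. The first function is the well-known Benjamini-Hochberg step-up procedure, which uses p values alone. The second is a simple weighted variant in which each p value is divided by a positive weight (weights averaging 1) before the same step-up rule is applied. The function names, the example p values, and the example weights are all assumptions chosen for illustration; in practice the weights would be derived from an informative covariate.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected
    at target FDR level `alpha`.
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by ascending p value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest 1-based rank k such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def weighted_benjamini_hochberg(pvalues, weights, alpha=0.05):
    """Weighted step-up sketch: run BH on p_i / w_i.

    Assumes (illustrative convention) strictly positive weights whose
    mean is 1, so hypotheses favored by the covariate get w_i > 1.
    """
    m = len(pvalues)
    if len(weights) != m:
        raise ValueError("pvalues and weights must have the same length")
    if m == 0:
        return []
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")
    if abs(sum(weights) / m - 1.0) > 1e-9:
        raise ValueError("weights must average to 1")
    adjusted = [p / w for p, w in zip(pvalues, weights)]
    return benjamini_hochberg(adjusted, alpha)


# Hypothetical example data, not taken from the paper.
example_p = [0.03, 0.5, 0.6, 0.7]
example_w = [3.0, 0.5, 0.25, 0.25]  # mean weight is 1
unweighted = benjamini_hochberg(example_p)
weighted = weighted_benjamini_hochberg(example_p, example_w)
```

In this toy example the unweighted procedure rejects nothing, while up-weighting the first hypothesis lets the weighted variant reject it. This mirrors, in a very simplified way, how an informative covariate can increase power; with all weights equal to 1 the two functions give identical results.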
