The Power of Gene-Based Rare Variant Methods to Detect Disease-Associated Variation and Test Hypotheses About Complex Disease

PLOS GENETICS (2015)

Journal

PLOS GENETICS

Volume 11, Issue 4

Publisher

PUBLIC LIBRARY SCIENCE

DOI: 10.1371/journal.pgen.1005165

Funding

Wellcome Trust [090367, 098381, 090532]
National Institutes of Health (NIH) [U01-DK085545, RC2-HG005688, HG000376, DK062370]
Doris Duke Charitable Foundation [2006087]
National Institute of Diabetes and Digestive and Kidney diseases (NIDDK) [R01-DK098032]
NIH [DK062370, T32GM007753, T32GM008313]
FWF [J-3401]
Clarendon Fund of the University of Oxford
Nuffield Department of Medicine
NIDDK [1RC2DK088389]

Abstract

Genome and exome sequencing in large cohorts enables characterization of the role of rare variation in complex diseases. Success in this endeavor, however, requires investigators to test a diverse array of genetic hypotheses which differ in the number, frequency and effect sizes of underlying causal variants. In this study, we evaluated the power of gene-based association methods to interrogate such hypotheses, and examined the implications for study design. We developed a flexible simulation approach, using 1000 Genomes data, to (a) generate sequence variation at human genes in up to 10K case-control samples, and (b) quantify the statistical power of a panel of widely used gene-based association tests under a variety of allelic architectures, locus effect sizes, and significance thresholds. For loci explaining ~1% of phenotypic variance underlying a common dichotomous trait, we find that all methods have low absolute power to achieve exome-wide significance (~5-20% power at α = 2.5×10^-6) in 3K individuals; even in 10K samples, power is modest (~60%). The combined application of multiple methods increases sensitivity, but does so at the expense of a higher false positive rate. MiST, SKAT-O, and KBAC have the highest individual mean power across simulated datasets, but we observe wide architecture-dependent variability in the individual loci detected by each test, suggesting that inferences about disease architecture from analysis of sequencing studies can differ depending on which methods are used. Our results imply that tens of thousands of individuals, extensive functional annotation, or highly targeted hypothesis testing will be required to confidently detect or exclude rare variant signals at complex disease loci.
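
As a rough illustration of two quantities mentioned in the abstract, the sketch below computes an exome-wide significance threshold and the false positive rate that results from calling a gene significant when any one of several tests passes. It rests on assumptions not stated in the abstract: that the threshold is a Bonferroni correction of a 0.05 family-wise level across roughly 20,000 genes, and that the tests are independent. Gene-based tests run on the same data are correlated, so the real inflation would lie somewhere between the single-test rate and the value computed here. The function names are hypothetical and are not taken from the paper.

```python
def bonferroni_threshold(family_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return family_alpha / n_tests


def any_test_false_positive_rate(alpha: float, n_methods: int) -> float:
    """False positive rate per gene when the gene is declared significant
    if ANY of n_methods tests has p < alpha.

    Assumes the tests are independent (an illustrative simplification);
    correlated tests give a rate between alpha and this value.
    """
    return 1.0 - (1.0 - alpha) ** n_methods


# Assumption: ~20,000 genes tested at a family-wise level of 0.05.
exome_alpha = bonferroni_threshold(0.05, 20_000)

# Assumption: three gene-based methods applied to each gene.
combined_rate = any_test_false_positive_rate(exome_alpha, 3)

print(f"per-gene threshold: {exome_alpha:.2e}")
print(f"any-of-3 false positive rate (independence assumed): {combined_rate:.2e}")
```

Under these assumptions, applying three methods and taking any significant result nearly triples the per-gene false positive rate unless the threshold is tightened accordingly.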
