4.8 Article

Efficient Bayesian mixed-model analysis increases association power in large cohorts

Journal

NATURE GENETICS
Volume 47, Issue 3, Pages 284-+

Publisher

NATURE PUBLISHING GROUP
DOI: 10.1038/ng.3190

Funding

  1. US National Institutes of Health [R01 HG006399]
  2. US National Institutes of Health fellowship [F32 HG007805]
  3. Fannie and John Hertz Foundation
  4. National Heart, Lung, and Blood Institute [HL043851, HL080467]
  5. National Cancer Institute [CA047988]
  6. Donald W. Reynolds Foundation
  7. Fondation Leducq
  8. Amgen
  9. Direct For Biological Sciences
  10. Div Of Biological Infrastructure [1349449] Funding Source: National Science Foundation

Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts and may not optimize power. All existing methods require time cost O(MN^2) (where N is the number of samples and M is the number of SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here we present a far more efficient mixed-model association method, BOLT-LMM, which requires only a small number of O(MN) time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to 9 quantitative traits in 23,294 samples from the Women's Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for genome-wide association studies in large cohorts.
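
The contrast between O(MN^2) and O(MN) cost can be made concrete with a small sketch. The abstract does not say which iterative scheme is used, so the example below assumes a standard conjugate gradient solver as one plausible way to get O(MN) iterations. It solves (X X^T / M + delta I) x = y using only matrix-vector products with the N x M genotype matrix X, so the N x N relationship matrix is never formed. The function names `grm_matvec` and `conjugate_gradient`, the parameter `delta`, and the simulated data are all hypothetical and are not taken from the BOLT-LMM software.

```python
import numpy as np

def grm_matvec(X, v, delta):
    """Compute (X X^T / M + delta * I) v in O(MN) time.

    X is an N x M matrix (samples by SNPs). The N x N matrix X X^T / M
    is never built; two matrix-vector products with X are used instead.
    """
    M = X.shape[1]
    return X @ (X.T @ v) / M + delta * v

def conjugate_gradient(X, y, delta, tol=1e-8, max_iter=200):
    """Solve (X X^T / M + delta * I) x = y by conjugate gradient.

    Returns the solution and the number of iterations used. Each
    iteration costs one call to grm_matvec, i.e. O(MN) time.
    """
    x = np.zeros_like(y, dtype=float)
    r = y - grm_matvec(X, x, delta)   # initial residual
    p = r.copy()                      # initial search direction
    rs_old = r @ r
    y_norm = np.linalg.norm(y)
    for iteration in range(1, max_iter + 1):
        Ap = grm_matvec(X, p, delta)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * y_norm:   # relative residual test
            return x, iteration
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x, max_iter

# Simulated stand-in data (assumption): standard normal "genotypes".
rng = np.random.default_rng(0)
N, M = 200, 1000
X = rng.standard_normal((N, M))
y = rng.standard_normal(N)
delta = 1.0

x_cg, n_iter = conjugate_gradient(X, y, delta)

# Reference answer via the explicit N x N matrix, which is the step that
# costs O(MN^2) to build and becomes impractical in large cohorts.
x_direct = np.linalg.solve(X @ X.T / M + delta * np.eye(N), y)
```

On this small simulated problem the iterative solution should agree with the direct solve to within the tolerance. The direct solve is included only as a check; in a large cohort it is the explicit N x N matrix that an iterative approach of this kind is meant to avoid.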
