Article

Improving GWAS discovery and genomic prediction accuracy in biobank data

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2022)

Journal

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

Volume 119, Issue 31

Publisher

NATL ACAD SCIENCES

DOI: 10.1073/pnas.2121279119

Keywords

genomic prediction; association study; Bayesian penalized regression

Funding

Swiss National Science Foundation Eccellenza Grant [PCEGP3-181181]
Australian National Health and Medical Research Council [1113400]
Australian Research Council [FL180100072]
Estonian Research Council [PRG687]
Institute of Science and Technology Austria


Automated Summary

Genetically informed, deep-phenotyped biobanks are an important research resource, and the recently developed Bayesian grouped mixture of regressions model (GMRM) achieves the highest genomic prediction accuracy reported to date across 21 heritable traits. GMRM accuracy is 15% and 14% greater than that of annotation prediction models run in LDAK and LDPred-funct, respectively, and 18% greater than that of a baseline BayesR model. Extended to association testing, GMRM increases the number of independent loci detected by 62% and 65% relative to BoltLMM and Regenie, respectively. The study emphasizes the importance of accounting for MAF and LD differences among SNP markers for both genomic prediction and discovery in large-scale individual-level studies.

Abstract

Genetically informed, deep-phenotyped biobanks are an important research resource and it is imperative that the most powerful, versatile, and efficient analysis approaches are used. Here, we apply our recently developed Bayesian grouped mixture of regressions model (GMRM) in the UK and Estonian Biobanks and obtain the highest genomic prediction accuracy reported to date across 21 heritable traits. When compared to other approaches, GMRM accuracy was greater than annotation prediction models run in the LDAK or LDPred-funct software by 15% (SE 7%) and 14% (SE 2%), respectively, and was 18% (SE 3%) greater than a baseline BayesR model without single-nucleotide polymorphism (SNP) markers grouped into minor allele frequency-linkage disequilibrium (MAF-LD) annotation categories. For height, the prediction accuracy R² was 47% in a UK Biobank holdout sample, which was 76% of the estimated h²_SNP. We then extend our GMRM prediction model to provide mixed-linear model association (MLMA) SNP marker estimates for genome-wide association (GWAS) discovery, which increased the independent loci detected to 16,162 in unrelated UK Biobank individuals, compared to 10,550 from BoltLMM and 10,095 from Regenie, a 62% and 65% increase, respectively. The average χ² value of the leading markers increased by 15.24 (SE 0.41) for every 1% increase in prediction accuracy gained over a baseline BayesR model across the traits. Thus, we show that modeling genetic associations accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for both genomic prediction and discovery in large-scale individual-level studies.
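As a rough illustration of what grouping SNP markers into MAF-LD categories can look like, the sketch below assigns each marker to a bin by its minor allele frequency and an LD score. It is not the GMRM implementation: the bin edges, the function names (`maf_ld_group`, `group_snps`) and the example markers are all hypothetical, chosen only to show the idea. In a grouped model, each resulting bin could then be given its own set of prior parameters.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical bin edges, for illustration only (not taken from the paper).
MAF_EDGES = [0.01, 0.05, 0.20]  # splits MAF into 4 bins (0..3)
LD_EDGES = [50.0, 150.0]        # splits LD score into 3 bins (0..2)


def maf_ld_group(maf, ld_score):
    """Return a (maf_bin, ld_bin) label for a single SNP."""
    return bisect_right(MAF_EDGES, maf), bisect_right(LD_EDGES, ld_score)


def group_snps(snps):
    """Map each (maf_bin, ld_bin) label to the SNP ids that fall in it.

    `snps` is an iterable of (snp_id, maf, ld_score) tuples.
    """
    groups = defaultdict(list)
    for snp_id, maf, ld_score in snps:
        groups[maf_ld_group(maf, ld_score)].append(snp_id)
    return dict(groups)


# Made-up markers; ids and values are placeholders.
example = [
    ("rs1", 0.003, 20.0),  # rare, low LD
    ("rs2", 0.30, 200.0),  # common, high LD
    ("rs3", 0.04, 60.0),   # low-frequency, medium LD
    ("rs4", 0.45, 180.0),  # common, high LD
]
groups = group_snps(example)
for label, ids in sorted(groups.items()):
    print(label, ids)
```

With these placeholder edges, the two common, high-LD markers share one bin while the rare and low-frequency markers each land in their own, so a model fitted per bin would not force a single effect-size distribution across all four.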
