
Statistical learning for sparser fine-mapped polygenic models: The prediction of LDL-cholesterol

GENETIC EPIDEMIOLOGY (2022)

Journal

GENETIC EPIDEMIOLOGY

Volume 46, Issue 8, Pages 589-603

Publisher

WILEY

DOI: 10.1002/gepi.22495

Keywords

boosting; polygenic score; stochastic search; UK Biobank; variable selection

Funding

BONFOR-program of the Medical Faculty University of Bonn [O-147.0002]

Abstract

This study proposes and applies a three-step strategy based on existing statistical learning methods to derive sparse models for genome-wide data with a polygenic signal. By using marginal screening, fine-mapping, and statistical boosting, this approach selects and fits multivariable regression models, improving the prediction performance and sparsity of polygenic risk scores.

Polygenic risk scores quantify an individual's genetic predisposition for a particular trait. We propose and illustrate the application of existing statistical learning methods to derive sparser models for genome-wide data with a polygenic signal. Our approach is based on three consecutive steps. First, potentially informative loci are identified by a marginal screening approach. Then, fine-mapping is applied independently to blocks of variants in linkage disequilibrium, where informative variants are retrieved using variable selection methods, including boosting with probing and stochastic searches with the Adaptive Subspace method. Finally, joint prediction models with the selected variants are derived using statistical boosting. In contrast to alternative approaches that rely on univariate summary statistics from genome-wide association studies, our three-step approach makes it possible to select and fit multivariable regression models on large-scale genotype data. Based on UK Biobank data, we develop prediction models for LDL-cholesterol as a continuous trait. Additionally, we consider a recent scalable algorithm for the Lasso. Results show that statistical learning approaches based on fine-mapping of genetic signals achieve prediction performance competitive with classical polygenic risk approaches, while yielding sparser risk models.
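The three steps described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the data are simulated continuous stand-ins for genotype dosages, linkage disequilibrium blocks are assumed to be fixed-size contiguous windows, fine-mapping is reduced to a simple form of componentwise L2 boosting that stops as soon as a permuted "probe" column is chosen, and the Adaptive Subspace method is not shown. All function names and parameter values (`marginal_screen`, `fine_map_block`, `l2_boost`, `three_step`, `keep`, `nu`, `steps`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance."""
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0
    return (X - X.mean(axis=0)) / sd

def marginal_screen(X, y, keep):
    """Step 1 (toy version): keep the variants with the largest
    absolute marginal association with the trait."""
    score = np.abs(standardize(X).T @ (y - y.mean()))
    return np.sort(np.argsort(score)[::-1][:keep])

def l2_boost(Z, y, steps=200, nu=0.1, n_real=None):
    """Componentwise L2 boosting on standardized columns Z.
    If n_real is given, columns with index >= n_real are treated as
    probes and fitting stops when one of them would be selected."""
    n, p = Z.shape
    coef = np.zeros(p)
    if p == 0:
        return coef
    resid = y - y.mean()
    for _ in range(steps):
        slopes = Z.T @ resid / n          # per-column least-squares slopes
        j = int(np.argmax(np.abs(slopes)))  # best-fitting single column
        if n_real is not None and j >= n_real:
            break                         # a probe won: stop selecting
        coef[j] += nu * slopes[j]
        resid = resid - nu * slopes[j] * Z[:, j]
    return coef

def fine_map_block(X, y, rng):
    """Step 2 (toy version): boosting with probing inside one block.
    Probes are row-permuted copies of the block's columns."""
    Z = standardize(X)
    probes = np.column_stack([rng.permutation(Z[:, j]) for j in range(Z.shape[1])])
    coef = l2_boost(np.hstack([Z, probes]), y, n_real=Z.shape[1])
    return np.flatnonzero(coef[:Z.shape[1]])

def three_step(X, y, block_size, keep, rng):
    """Screening, block-wise fine-mapping, then a joint boosted model."""
    screened = marginal_screen(X, y, keep)
    selected = []
    for b in np.unique(screened // block_size):
        idx = screened[screened // block_size == b]
        selected.extend(idx[fine_map_block(X[:, idx], y, rng)])
    selected = np.array(sorted(selected), dtype=int)
    # Step 3: joint model on the selected variants only
    coef = l2_boost(standardize(X[:, selected]), y)
    return selected, coef

# Simulated data (assumption): 6 blocks of 10 correlated variants each
rng = np.random.default_rng(0)
n, p, block_size, rho = 500, 60, 10, 0.6
latent = np.repeat(rng.standard_normal((n, p // block_size)), block_size, axis=1)
X = np.sqrt(rho) * latent + np.sqrt(1 - rho) * rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[3, 27, 45]] = [1.0, -0.8, 0.6]      # three causal variants
y = X @ beta + rng.standard_normal(n)

selected, coef = three_step(X, y, block_size, keep=30, rng=rng)
print("selected variants:", selected.tolist())
print("joint coefficients:", np.round(coef, 2).tolist())
```

In this sketch the final model contains only the variants that survive both screening and block-wise selection, which mirrors the sparsity argument of the abstract. On real genotype data the block definition, the stopping rule, and the boosting hyperparameters would all need to be chosen with care.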
