4.7 Article

A Fast and Accurate Method for Genome-wide Scale Phenome-wide G x E Analysis and Its Application to UK Biobank

Journal

AMERICAN JOURNAL OF HUMAN GENETICS
Volume 105, Issue 6, Pages 1182-1192

Publisher

CELL PRESS
DOI: 10.1016/j.ajhg.2019.10.008

Funding

  1. National Institutes of Health [R01 HG008773]
  2. [45227]

The etiology of most complex diseases involves genetic variants, environmental factors, and gene-environment interaction (G x E) effects. Compared with marginal genetic association studies, G x E analysis requires more samples and detailed measures of environmental exposures, which limits the possible discoveries. Large-scale population-based biobanks with detailed phenotypic and environmental information, such as UK-Biobank, can be ideal resources for identifying G x E effects. However, due to the large computation cost and the presence of case-control imbalance, existing methods often fail. Here we propose a scalable and accurate method, SPAGE (SaddlePoint Approximation implementation of G x E analysis), that is applicable for genome-wide scale phenome-wide G x E studies. SPAGE fits a genotype-independent logistic model only once across the genome-wide analysis in order to reduce computation cost, and it uses a saddlepoint approximation (SPA) to calibrate the test statistics for analysis of phenotypes with unbalanced case-control ratios. Simulation studies show that SPAGE is 33-79 times faster than the Wald test and 72-439 times faster than Firth's test, and that SPAGE can control type I error rates at the genome-wide significance level even when case-control ratios are extremely unbalanced. Through the analysis of UK-Biobank data of 344,341 white British European-ancestry samples, we show that SPAGE can efficiently analyze large samples while controlling for unbalanced case-control ratios.
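
To illustrate the saddlepoint idea mentioned in the abstract, the following is a minimal, generic sketch and not the SPAGE implementation. It assumes a score-type statistic S = sum_i g_i (Y_i - mu_i), where mu_i are fitted case probabilities from a null logistic model that has been fitted once, and g_i is any per-sample weight (for example, a covariate-adjusted interaction term). The function names (`cgf`, `spa_upper_tail`, and so on) and the toy data are hypothetical and chosen only for this example. The tail probability uses the standard Barndorff-Nielsen form of the saddlepoint approximation, with the root found by simple bisection.

```python
import math

def cgf(t, g, mu):
    """Cumulant generating function K(t) of S = sum_i g_i * (Y_i - mu_i),
    with independent Y_i ~ Bernoulli(mu_i)."""
    return sum(math.log(1 - m + m * math.exp(gi * t)) - t * gi * m
               for gi, m in zip(g, mu))

def cgf_d1(t, g, mu):
    """First derivative K'(t)."""
    total = 0.0
    for gi, m in zip(g, mu):
        e = math.exp(gi * t)
        total += gi * m * e / (1 - m + m * e) - gi * m
    return total

def cgf_d2(t, g, mu):
    """Second derivative K''(t); equals Var(S) at t = 0."""
    total = 0.0
    for gi, m in zip(g, mu):
        e = math.exp(gi * t)
        d = 1 - m + m * e
        total += gi * gi * m * (1 - m) * e / (d * d)
    return total

def normal_sf(x):
    """Upper tail of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def spa_upper_tail(s, g, mu, bracket=30.0, tol=1e-10):
    """Saddlepoint approximation to P(S >= s)."""
    lo, hi = -bracket, bracket
    if not (cgf_d1(lo, g, mu) < s < cgf_d1(hi, g, mu)):
        raise ValueError("s is outside the range covered by the bracket")
    # K'(t) is increasing in t, so bisection finds the root of K'(t) = s.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cgf_d1(mid, g, mu) < s:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    if abs(t) < 1e-6:
        # Near the mean the formula is numerically unstable; use the normal tail.
        return normal_sf(s / math.sqrt(cgf_d2(0.0, g, mu)))
    w = math.copysign(math.sqrt(2.0 * (t * s - cgf(t, g, mu))), t)
    v = t * math.sqrt(cgf_d2(t, g, mu))
    return normal_sf(w + math.log(v / w) / w)

# Toy, unbalanced setting (assumed values): 20 samples, each with a 10% case probability.
g = [1.0] * 20
mu = [0.1] * 20
p_spa = spa_upper_tail(5.0, g, mu)
p_normal = normal_sf(5.0 / math.sqrt(cgf_d2(0.0, g, mu)))
```

In this toy setting the saddlepoint tail is noticeably larger than the normal-approximation tail, which is the kind of miscalibration under case-control imbalance that the abstract says the SPA step is meant to correct.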
