Article

Sample size and power analysis for sparse signal recovery in genome-wide association studies

Journal

BIOMETRIKA
Volume 98, Issue 2, Pages 273-290

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/biomet/asr003

Keywords

False discovery rate; False non-discovery rate; High-dimensional data; Multiple testing; Oracle exact recovery

Funding

  1. National Institutes of Health
  2. National Science Foundation
  3. Direct For Mathematical & Physical Scien
  4. Division Of Mathematical Sciences [0854973] Funding Source: National Science Foundation

Genome-wide association studies have successfully identified hundreds of novel genetic variants associated with many complex human diseases. However, there is a lack of rigorous work on evaluating the statistical power for identifying these variants. In this paper, we consider sparse signal identification in genome-wide association studies and present two analytical frameworks for detailed analysis of the statistical power for detecting and identifying the disease-associated variants. We present an explicit sample size formula for achieving a given false non-discovery rate while controlling the false discovery rate based on an optimal procedure. Sparse genetic variant recovery is also considered and a boundary condition is established in terms of sparsity and signal strength for almost exact recovery of both disease-associated variants and nondisease-associated variants. A data-adaptive procedure is proposed to achieve this bound. The analytical results are illustrated with a genome-wide association study of neuroblastoma.
