Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets

HUMAN GENETICS (2012)

Journal

HUMAN GENETICS

Volume 131, Issue 5, Pages 747-756

Publisher

SPRINGER

DOI: 10.1007/s00439-011-1118-2

Funding

HKU [7672/06 M, 201007176166]
European Community [HEALTH-F2-2010-241909]
University of Hong Kong Strategic Research Theme on Genomics

Abstract

Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further boosted by advanced statistical genotype-imputation algorithms and large SNP databases for reference human populations. The testing of a huge number of SNPs needs to be taken into account in the interpretation of statistical significance in such genome-wide studies, but this is complicated by the non-independence of SNPs because of linkage disequilibrium (LD). Several previous groups have proposed the use of the effective number of independent markers (M_e) for the adjustment of multiple testing, but current methods of calculation for M_e are limited in accuracy or computational speed. Here, we report a more robust and fast method to calculate M_e. Applying this efficient method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined the M_e, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for HapMap Project and 1000 Genomes Project datasets which are widely used in genotype imputation as reference panels. Our results suggested the use of a p-value threshold of ~10⁻⁷ as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent p-value thresholds of ~5 × 10⁻⁸ for current or merged commercial genotyping arrays, ~10⁻⁸ for all common SNPs in the 1000 Genomes Project dataset and ~5 × 10⁻⁸ for the common SNPs only within genes.
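
To make the link between an effective number of independent markers and a p-value threshold concrete, here is a minimal Python sketch. The abstract does not state the formula used by GEC, so the estimator below is an assumption: a generic eigenvalue-based rule that subtracts from the SNP count the excess of every correlation-matrix eigenvalue above 1. The function names `effective_number_of_tests` and `bonferroni_threshold` are hypothetical, and the toy correlation matrix is illustrative, not data from the paper.

```python
import numpy as np


def effective_number_of_tests(corr: np.ndarray) -> float:
    """Eigenvalue-based estimate of the effective number of independent tests.

    ASSUMPTION: uses M - sum(max(lambda_i - 1, 0)) over the eigenvalues of a
    symmetric SNP correlation matrix. This is an illustrative estimator, not
    necessarily what the GEC tool computes.
    """
    corr = np.asarray(corr, dtype=float)
    if corr.ndim != 2 or corr.shape[0] != corr.shape[1]:
        raise ValueError("corr must be a square matrix")
    # eigvalsh is the appropriate solver for symmetric matrices.
    eigvals = np.linalg.eigvalsh(corr)
    # Only eigenvalues above 1 signal redundancy among correlated markers.
    excess = np.clip(eigvals - 1.0, 0.0, None).sum()
    return float(corr.shape[0] - excess)


def bonferroni_threshold(m_e: float, alpha: float = 0.05) -> float:
    """Per-test p-value threshold that keeps the family-wise error near alpha."""
    if m_e <= 0:
        raise ValueError("m_e must be positive")
    return alpha / m_e


if __name__ == "__main__":
    # Toy LD structure: two pairs of perfectly correlated SNPs, with no
    # correlation between the pairs (4 SNPs, 2 independent signals).
    toy_corr = np.array(
        [
            [1.0, 1.0, 0.0, 0.0],
            [1.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
        ]
    )
    m_e = effective_number_of_tests(toy_corr)
    print("effective number of tests:", m_e)
    print("per-test threshold:", bonferroni_threshold(m_e))
```

Under this rule, independent SNPs keep the full count, while perfectly correlated SNPs collapse to a single test. Dividing the type 1 error rate of 0.05 by an effective number of about one million gives a threshold of about 5 × 10⁻⁸, the order of magnitude reported in the abstract for current or merged arrays.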
