Methods for Collapsing Multiple Rare Variants in Whole-Genome Sequence Data

GENETIC EPIDEMIOLOGY (2014)

Journal

GENETIC EPIDEMIOLOGY

Volume 38, Issue -, Pages S13-S20

Publisher

WILEY-BLACKWELL

DOI: 10.1002/gepi.21820

Keywords

Genetic Analysis Workshop 18; rare variants; whole-genome sequence; burden tests; nonburden tests

Funding

National Institutes of Health (NIH) [R01 GM031575, HL086694, HL107552, HL111249]

Abstract

Genetic Analysis Workshop 18 provided whole-genome sequence data in a pedigree-based sample and longitudinal phenotype data for hypertension and related traits, presenting an excellent opportunity for evaluating analysis choices. We summarize the nine contributions to the working group on collapsing methods, which evaluated various approaches for the analysis of multiple rare variants. One contributor defined a variant prioritization scheme, whereas the remaining eight contributors evaluated statistical methods for association analysis. Six contributors chose the gene as the genomic region for collapsing variants, whereas three contributors chose nonoverlapping sliding windows across the entire genome. Statistical methods spanned most of the published methods, including well-established burden tests, variance-components-type tests, and recently developed hybrid approaches. Lesser known methods, such as functional principal components analysis, higher criticism, and homozygosity association, and some newly introduced methods were also used. We found that performance of these methods depended on the characteristics of the genomic region, such as effect size and direction of variants under consideration. Except for MAP4 and FLT3, the performance of all statistical methods to identify rare causal variants was disappointingly poor, providing overall power almost identical to the type I error. This poor performance may have arisen from a combination of (1) small sample size, (2) small effects of most of the causal variants, explaining a small fraction of variance, (3) use of incomplete annotation information, and (4) linkage disequilibrium between causal variants in a gene and noncausal variants in nearby genes. Our findings demonstrate challenges in analyzing rare variants identified from sequence data. (C) 2014 Wiley Periodicals, Inc.
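
For readers unfamiliar with the burden tests named in the abstract, the following is a minimal sketch of the general idea, not any contributor's actual method. It assumes unrelated individuals (the workshop data were pedigree-based, which this sketch ignores), a quantitative phenotype, and a hypothetical minor-allele-frequency cutoff of 0.01. The function names, the genotype matrix, and the phenotype values are all made-up illustrations.

```python
from math import sqrt


def burden_scores(genotypes, mafs, maf_threshold=0.01):
    """Collapse the rare variants of one region into a per-person score.

    genotypes: one list per person of minor-allele counts (0, 1, or 2).
    mafs: minor-allele frequency of each variant, in column order.
    maf_threshold: hypothetical cutoff below which a variant is "rare".
    """
    rare = [j for j, maf in enumerate(mafs) if maf < maf_threshold]
    return [sum(person[j] for j in rare) for person in genotypes]


def burden_test(scores, phenotype):
    """Regress the phenotype on the burden score by ordinary least squares.

    Returns (slope, t_statistic). Assumes unrelated individuals.
    """
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    if sxx == 0:
        raise ValueError("burden score does not vary across individuals")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, phenotype))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(scores, phenotype))
    standard_error = sqrt(rss / (n - 2) / sxx)
    return slope, slope / standard_error


# Toy data (invented): 6 people, 3 variants in one gene; the third is common.
genotypes = [
    [0, 0, 1],
    [1, 0, 2],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 2],
    [0, 0, 0],
]
mafs = [0.005, 0.008, 0.30]
phenotype = [120, 131, 128, 119, 142, 121]  # invented trait values

scores = burden_scores(genotypes, mafs)  # [0, 1, 1, 0, 2, 0]
slope, t_stat = burden_test(scores, phenotype)
print(scores, slope, t_stat)
```

Because every rare allele adds to the score with the same sign and weight, a test of this kind loses power when a region mixes risk-increasing and protective variants, which is consistent with the abstract's remark that performance depended on the effect size and direction of the variants under consideration.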
