A high-performance computing toolset for relatedness and principal component analysis of SNP data

BIOINFORMATICS (2012)

Journal

BIOINFORMATICS

Volume 28, Issue 24, Pages 3326-3328

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bioinformatics/bts606

Funding

US National Institutes of Health, Genes, Environment and Health Initiative
Genetics Coordinating Center [U01 HG 004446]

Abstract

Summary: Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed gdsfmt and SNPRelate (R packages for multi-core symmetric multiprocessing computer architectures) to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and highly optimized. Benchmarks show the uniprocessor implementations of PCA and identity-by-descent are 8-50 times faster than the implementations provided in the popular EIGENSTRAT (v3.0) and PLINK (v1.07) programs, respectively, and can be sped up to 30-300-fold by using eight cores. SNPRelate can analyse tens of thousands of samples with millions of SNPs. For example, our package was used to perform PCA on 55 324 subjects from the 'Gene-Environment Association Studies' consortium studies.

Availability and implementation: gdsfmt and SNPRelate are available from R CRAN (http://cran.r-project.org), including a vignette. A tutorial can be found at https://www.genevastudy.org/Accomplishments/software.

Contact: zhengx@u.washington.edu
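To make the PCA computation named in the abstract concrete, the following is a minimal sketch of PCA on a small SNP genotype matrix. It is not the SNPRelate implementation or its API. It assumes the allele-frequency standardization commonly used for EIGENSTRAT-style PCA (centre each SNP by its mean and scale by sqrt(p(1 - p))), ignores missing genotypes, and finds only the leading component by power iteration. The function names `genetic_covariance` and `top_component` and the toy genotypes are hypothetical, chosen for illustration.

```python
import math


def genetic_covariance(genotypes):
    """Return the sample-by-sample covariance of standardized genotypes.

    genotypes: list of SNP rows; each row holds one allele count
    (0, 1 or 2) per sample. Missing data is not handled in this sketch.
    """
    n_samples = len(genotypes[0])
    cov = [[0.0] * n_samples for _ in range(n_samples)]
    used = 0
    for row in genotypes:
        mean = sum(row) / n_samples
        freq = mean / 2.0  # estimated allele frequency p
        if freq <= 0.0 or freq >= 1.0:
            continue  # monomorphic SNP: carries no information, skip it
        scale = math.sqrt(freq * (1.0 - freq))
        z = [(g - mean) / scale for g in row]
        # Accumulate the upper triangle of the outer product z z^T.
        for i in range(n_samples):
            for j in range(i, n_samples):
                cov[i][j] += z[i] * z[j]
        used += 1
    if used == 0:
        raise ValueError("no polymorphic SNPs in input")
    # Average over the SNPs used and mirror into the lower triangle.
    for i in range(n_samples):
        for j in range(i, n_samples):
            cov[i][j] /= used
            cov[j][i] = cov[i][j]
    return cov


def top_component(cov, iterations=200):
    """Return the unit-length leading eigenvector of cov by power iteration."""
    n = len(cov)
    # Deterministic, non-constant start vector (assumption of this sketch).
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("covariance matrix is zero")
        v = [x / norm for x in w]
    return v


# Toy data (made up): four SNPs, six samples from two groups of three.
genotypes = [
    [0, 0, 0, 2, 2, 2],
    [2, 2, 2, 0, 0, 0],
    [0, 0, 1, 2, 2, 2],
    [2, 1, 2, 0, 0, 0],
]
pc1 = top_component(genetic_covariance(genotypes))
```

In this toy input the first three samples and the last three samples receive coordinates of opposite sign on `pc1`, which is the kind of population structure PCA is used to detect. The double loop over sample pairs is the quadratic-cost step that a multi-core implementation would parallelize.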
