Journal
BIOINFORMATICS
Volume 22, Issue 20, Pages 2558-2561
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btl420
Funding
- NIGMS NIH HHS [1 P20 GM065762-01A1] Funding Source: Medline
The search for the association between complex diseases and single nucleotide polymorphisms (SNPs) or haplotypes has recently received great attention. For these studies, it is essential to use a small subset of informative SNPs accurately representing the rest of the SNPs. Informative SNP selection can achieve (1) considerable budget savings by genotyping only a limited number of SNPs and computationally inferring all other SNPs or (2) necessary reduction of the huge SNP sets (obtained, e.g. from Affymetrix) for further fine haplotype analysis. A novel informative SNP selection method for unphased genotype data based on multiple linear regression (MLR) is implemented in the software package MLR-tagging. This software can be used for informative SNP (tag) selection and genotype prediction. The stepwise tag selection algorithm (STSA) selects positions of the given number of informative SNPs based on a genotype sample population. The MLR SNP prediction algorithm predicts a complete genotype based on the values of its informative SNPs, their positions among all SNPs, and a sample of complete genotypes. An extensive experimental study on various datasets including 10 regions from HapMap shows that the MLR prediction combined with stepwise tag selection uses fewer tags than the state-of-the-art method of Halperin et al. (2005).
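The abstract does not spell out the algorithms, so the following is only a minimal sketch of the two ideas it names: regression-based genotype prediction and stepwise tag selection. It assumes unphased genotypes encoded as 0, 1, or 2 per SNP, an ordinary least-squares fit with an intercept, rounding to the nearest valid genotype value, and a greedy forward search scored by in-sample prediction errors. The function names (mlr_predict, sample_errors, stepwise_tags) are hypothetical and are not the interface of the MLR-tagging package.

```python
import numpy as np


def _fit(sample, tags):
    """Least-squares fit of every SNP column on an intercept plus the tag columns."""
    X = np.column_stack([np.ones(len(sample)), sample[:, list(tags)]])
    coef, *_ = np.linalg.lstsq(X, sample, rcond=None)
    return X, coef


def mlr_predict(sample, tags, tag_values):
    """Predict a complete genotype from its tag values (assumed 0/1/2 encoding)."""
    sample = np.asarray(sample, dtype=float)
    _, coef = _fit(sample, tags)
    x_new = np.concatenate([[1.0], np.asarray(tag_values, dtype=float)])
    # Round the regression output to the nearest valid genotype value.
    pred = np.clip(np.rint(x_new @ coef), 0, 2).astype(int)
    # Tag SNPs are observed, so copy them instead of predicting them.
    pred[list(tags)] = np.asarray(tag_values, dtype=int)
    return pred


def sample_errors(sample, tags):
    """Count mispredicted entries when each sample row is rebuilt from its own tags."""
    sample = np.asarray(sample, dtype=float)
    X, coef = _fit(sample, tags)
    pred = np.clip(np.rint(X @ coef), 0, 2)
    return int(np.sum(pred != sample))


def stepwise_tags(sample, k):
    """Greedy forward selection: add the SNP that most reduces in-sample errors."""
    sample = np.asarray(sample, dtype=float)
    tags = []
    for _ in range(k):
        candidates = [j for j in range(sample.shape[1]) if j not in tags]
        best = min(candidates, key=lambda j: sample_errors(sample, tags + [j]))
        tags.append(best)
    return sorted(tags)
```

Scoring candidates on the same sample used for fitting keeps the sketch short; a held-out or leave-one-out error estimate would be a more cautious way to compare tag sets.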