Article

Imputation of missing single nucleotide polymorphism genotypes using a multivariate mixed model framework

Journal

JOURNAL OF ANIMAL SCIENCE
Volume 89, Issue 7, Pages 2042-2049

Publisher

OXFORD UNIV PRESS INC
DOI: 10.2527/jas.2010-3297

Keywords

best linear unbiased prediction; imputation; missing genotype

Funding

  1. NWO (Netherlands Organisation for Scientific Research)-Casimir, CRV
  2. Hendrix Genetics

Abstract

The objective of this paper was to investigate, for various scenarios at low and high marker density, the accuracy of imputing genotypes with a multivariate mixed model framework that uses information from 2, 4, or 10 surrounding markers. This model predicts genotypes at a locus, using genotypes at nearby loci as correlated traits and the additive genetic relationship matrix to draw on information from genotyped relatives. For 2 scenarios, this method was compared with the population-based imputation algorithms Fast-PHASE and Beagle. Accuracies of imputation were obtained with Monte Carlo simulation and predicted with selection index theory, using input from the simulated data. Five different scenarios of missing genotypes were considered: 1) genotypes of some loci are missing due to genotyping errors, 2) juvenile selection candidates are genotyped using a smaller SNP panel, 3) some animals in the pedigree of a breeding population are not genotyped, 4) juvenile selection candidates are not genotyped, and 5) 1 generation of animals at the top of the pedigree is not genotyped. Surrounding marker information did not improve accuracy of imputation when the animals whose genotypes were imputed were not genotyped for those surrounding markers. When those animals were genotyped for surrounding markers, results indicated a limited gain when linkage disequilibrium (LD) between SNP was low, but a substantial increase in accuracy when LD between SNP was high. For scenario 1, using 1 vs. 11 SNP, accuracy was, respectively, 0.75 and 0.81 at low, and 0.75 and 0.93 at high density. For scenario 2, using 1 vs. 11 SNP, accuracy was, respectively, 0.70 and 0.73 at low, and 0.71 and 0.84 at high density. Beagle outperformed the other methods at high SNP density, whereas the multivariate mixed model was clearly superior when SNP density was low and animals were genotyped with a reduced SNP panel.
The results showed that extending the univariate gene content method to a multivariate BLUP model with inclusion of surrounding marker information only yields greater imputation accuracy when the animals with imputed loci are at least genotyped for some SNP that are in LD with the SNP to be imputed. The equation derived from selection index theory accurately predicted the accuracy of imputation using the multivariate mixed model framework.
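
The two ideas the abstract relies on, borrowing information from genotyped relatives through the additive relationship matrix and predicting accuracy with selection index theory, can be illustrated with a minimal sketch. This is not the authors' model or their derived equation: it assumes a simplified univariate case with known relationship coefficients, standardized variables, and a known mean gene content, and the function names `impute_gene_content` and `index_accuracy` as well as all input values are hypothetical.

```python
import numpy as np


def impute_gene_content(A_mo, A_oo, g_obs, mu):
    """Conditional mean of gene content (allele count 0/1/2) for animals
    without a genotype, given genotyped relatives.

    A_mo : relationships between animals with missing and observed genotypes
    A_oo : relationships among the animals with observed genotypes
    g_obs: observed allele counts
    mu   : assumed population mean allele count (hypothetical input)
    """
    A_mo = np.asarray(A_mo, dtype=float)
    A_oo = np.asarray(A_oo, dtype=float)
    g_obs = np.asarray(g_obs, dtype=float)
    # Solve A_oo x = (g_obs - mu) instead of forming an explicit inverse.
    return mu + A_mo @ np.linalg.solve(A_oo, g_obs - mu)


def index_accuracy(P, c):
    """Selection-index accuracy sqrt(c' P^-1 c) for standardized variables.

    P: correlation matrix among the information sources (e.g. nearby SNP)
    c: correlations between each information source and the target locus
    """
    P = np.asarray(P, dtype=float)
    c = np.asarray(c, dtype=float)
    return float(np.sqrt(c @ np.linalg.solve(P, c)))


# Illustrative case: one ungenotyped offspring of two genotyped,
# unrelated, non-inbred parents (relationship 0.5 with each parent).
A_oo = np.eye(2)
A_mo = np.array([[0.5, 0.5]])
imputed = impute_gene_content(A_mo, A_oo, g_obs=[2.0, 0.0], mu=1.0)

# Illustrative case: one information source vs. two mutually uncorrelated
# sources, each correlated 0.5 with the target locus.
acc_one = index_accuracy([[1.0]], [0.5])
acc_two = index_accuracy(np.eye(2), [0.5, 0.5])
```

In this toy setting the imputed gene content of the offspring is the parental average, and adding a second, independent information source raises the index accuracy. A multivariate version as described in the abstract would additionally model the covariances between the target locus and surrounding loci across animals, which this sketch does not attempt.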
