4.1 Article

Effect of minor allele frequency and density of single nucleotide polymorphism marker arrays on imputation performance and prediction ability using the single-step genomic Best Linear Unbiased Prediction in a simulated beef cattle population

期刊

ANIMAL PRODUCTION SCIENCE
卷 63, 期 9, 页码 844-852

出版社

CSIRO PUBLISHING
DOI: 10.1071/AN21581

关键词

bias; bovine; customised SNP arrays; genomic selection; imputation accuracy; inflation; MAF; simulation

向作者/读者索取更多资源

This study evaluated the impact of SNP marker density and MAF on genomic predictions and imputation performance for beef cattle using ssGBLUP methodology. The results showed that at least 5% of SNP markers from a high-density array are necessary to obtain reliable genomic predictions and imputed genotypes. The findings suggest that developing low-density customised arrays based on MAF and even distribution of SNPs could be a cost-effective approach for implementing genomic selection in beef cattle.
Context. In beef cattle populations, there is little evidence regarding the minimum number of genetic markers needed to obtain reliable genomic prediction and imputed genotypes. Aims. This study aimed to evaluate the impact of single nucleotide polymorphism (SNP) marker density and minor allele frequency (MAF), on genomic predictions and imputation performance for high and de low heritability traits using the single-step genomic Best Linear Unbiased Prediction methodology (ssGBLUP) in a simulated beef cattle population. Methods. The simulated genomic and phenotypic data were obtained through QMsim software. 735 293 SNPs markers and 7000 quantitative trait loci (QTL) were randomly simulated. The mutation rate (10(-5)), QTL effects distribution (gamma distribution with shape parameter = 0.4) and minor allele frequency (MAF >= 0.02) of markers were used for quality control. A total of 335k SNPs (high density, HD) and 1000 QTLs were finally considered. Densities of 33 500 (35k), 16 750 (16k), 4186 (4k) and 2093 (2k) SNPs were customised through windows of 10, 20, 80 and 160 SNPs by chromosome, respectively. Three marker selection criteria were used within windows: (1) informative markers with MAF values close to 0.5 (HI); (2) less informative markers with the lowest MAF values (LI); (3) markers evenly distributed (ED). We evaluated the prediction of the high-density array and of 12 scenarios of customised SNP arrays, further the imputation performance of them. The genomic predictions and imputed genotypes were obtained with Blupf90 and FImpute software, respectively, and statistics parameters were applied to evaluate the accuracy of genotypes imputed. The Pearson'scorrelation,thecoefficient of regression, and the difference between genomic predictions and true breeding values were used to evaluate the prediction ability (PA), inflation (b), and bias (d), respectively. Key results. Densities above 16k SNPs using HI and ED criteria displayed lower b, higher PA and higher imputation accuracy. Consequently, similar values of PA, b and d were observed with the use of imputed genotypes. The LI criterion with densities higher than 35k SNPs, showed higher PA and similar predictions using imputed genotypes, however lower b and quality of imputed genotypes were observed. Conclusion. The results obtained showed that at least 5% of HI or ED SNPs available in the HD array are necessary to obtain reliable genomic predictions and imputed genotypes. Implications. The development of low-density customised arrays based on criteria of MAF and even distribution of SNPs, might be a cost-effective and feasible approach to implement genomic selection in beef cattle.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.1
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据