Article

A Comparison of Three Machine Learning Methods for Multivariate Genomic Prediction Using the Sparse Kernels Method (SKM) Library

GENES (2022)

Journal

GENES

Volume 13, Issue 8

Publisher

MDPI

DOI: 10.3390/genes13081494

Keywords

multi-trait; statistical machine learning; genomic selection; plant breeding; multi-environment

Category

Genetics & Heredity

Funding

Bill & Melinda Gates Foundation [INV-003439]
USAID projects USAID Amend [9 MTO 069033]
USAID-CIMMYT Wheat/AGGMW
AGG-Maize Supplementary Project
AGG (Stress Tolerant Maize for Africa)
CIMMYT CRP (maize and wheat)
Foundation for Research Levy on Agricultural Products (FFL)
Agricultural Agreement Research Fund (JA) in Norway through NFR grant [267806]

Summary

Genomic selection has revolutionized the way plant breeders select genotypes, using statistical machine learning models to predict phenotypic values of new lines. Multi-trait genomic prediction models leverage correlated traits to improve accuracy. This paper compares the performance of three multi-trait methods and finds that their relative performance depends on the predictors used.

Abstract

Genomic selection (GS) has changed the way plant breeders select genotypes. GS uses phenotypic and genotypic information to train a statistical machine learning model, which is then used to predict phenotypic (or breeding) values of new lines for which only genotypic information is available. Many statistical machine learning methods have been proposed for this task. Multi-trait (MT) genomic prediction models take advantage of correlated traits to improve prediction accuracy, and some multivariate statistical machine learning methods are therefore popular for GS. In this paper, we compare the prediction performance of three MT methods: the MT genomic best linear unbiased predictor (GBLUP), MT partial least squares (PLS) and MT random forest (RF). Benchmarking was performed with six real datasets. We found that the three methods produce similar results, but with predictors that include environment (E) and genotype (G), that is, E + G, the MT GBLUP achieved superior performance, whereas with the predictors E + G + genotype × environment (GE) and G + GE, random forest achieved the best results. We also found that the best predictions were achieved with the predictors E + G and E + G + GE. We also provide the R code for implementing these three statistical machine learning methods in the sparse kernel method (SKM) library, which offers not only options for single-trait prediction with various statistical machine learning methods but also some options for MT predictions that can help to better capture complex patterns in datasets that are common in genomic selection.
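
To make the predictor notation in the abstract concrete, the following is a minimal Python sketch of how the E + G, E + G + GE and G + GE predictors could be assembled into a design matrix. It is an illustration only, not the paper's R code and not the SKM API: the function names `one_hot` and `build_predictor`, the toy marker codes, and the choice to represent GE as products of environment indicators and marker codes are all assumptions made for this example.

```python
def one_hot(labels):
    """Return one indicator row per label, with columns in sorted level order."""
    levels = sorted(set(labels))
    return [[1.0 if label == level else 0.0 for level in levels] for label in labels]


def build_predictor(envs, lines, markers, terms=("E", "G", "GE")):
    """Assemble a design matrix from the requested terms (hypothetical helper).

    envs    : environment label of each observation
    lines   : genotype (line) label of each observation
    markers : dict mapping a line label to its list of marker codes
    terms   : any subset of "E", "G" and "GE"
    """
    env_rows = one_hot(envs)
    design = []
    for env_row, line in zip(env_rows, lines):
        g = [float(code) for code in markers[line]]
        row = []
        if "E" in terms:
            row.extend(env_row)  # environment main effect
        if "G" in terms:
            row.extend(g)  # genotype main effect via marker codes
        if "GE" in terms:
            # Assumed interaction coding: one copy of the marker codes per
            # environment, zeroed outside the observation's own environment.
            for indicator in env_row:
                row.extend(indicator * code for code in g)
        design.append(row)
    return design


# Toy data (made up): two lines observed in two environments, three markers.
envs = ["Env1", "Env1", "Env2", "Env2"]
lines = ["L1", "L2", "L1", "L2"]
markers = {"L1": [0, 1, 2], "L2": [2, 1, 0]}

X_eg = build_predictor(envs, lines, markers, terms=("E", "G"))
X_full = build_predictor(envs, lines, markers, terms=("E", "G", "GE"))
X_gge = build_predictor(envs, lines, markers, terms=("G", "GE"))

for name, X in (("E + G", X_eg), ("E + G + GE", X_full), ("G + GE", X_gge)):
    print(name, "->", len(X), "rows x", len(X[0]), "columns")
```

In this sketch, the same rows would be paired with a matrix of several trait values per observation and passed to whichever multi-trait learner is being compared; the three predictors differ only in which column blocks are included.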
