4.4 Article

Designing optimal training sets for genomic prediction using adversarial validation with probit regression

Journal

PLANT BREEDING
Volume -, Issue -, Pages -

Publisher

WILEY
DOI: 10.1111/pbr.13124

Keywords

adversarial validation; genomic prediction; mismatch in distributions; optimal training set selection; plant breeding; probit regression


Genomic selection is revolutionizing animal and plant breeding, but its implementation is challenging because the training and testing set distributions often do not match. This research used adversarial validation with probit regression to detect this distribution mismatch and to select optimal training sets. Evaluations showed that the proposed method effectively detected the mismatch and achieved higher prediction accuracy than existing methods.
Genomic selection (GS) is a disruptive methodology that is revolutionizing animal and plant breeding. However, its practical implementation is challenging because the distributions of the training and testing sets often do not match. Adversarial validation is an approach popular in machine learning for detecting and addressing differences between the training and testing distributions. In this research, adversarial validation was therefore implemented using probit regression, both to detect the mismatch in distributions and to select an optimal training set. We evaluated the proposed method on 14 datasets and benchmarked the results against using the whole reference population and against simple random samples. We found that the proposed method is effective for detecting the mismatch in distributions and that it improved prediction accuracy by 11.67% (in terms of mean square error) and by 5.35% (in terms of normalized mean square error) relative to using the whole reference population as the training set. In general, it also outperformed some existing methods for optimal training set design in the context of GS.
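
The abstract does not spell out the procedure, so the following is only a minimal sketch of how adversarial validation with a probit classifier is commonly set up, not the authors' implementation. Training lines are labelled 0 and testing lines 1, a probit regression is fitted to separate them, and an AUC well above 0.5 is read as a sign of distribution mismatch. The training lines with the highest predicted probability of belonging to the testing set are then kept as the training set. The function names, the gradient-ascent fit, the synthetic two-feature data (standing in for marker-derived covariates), and the "keep the top `n_select` lines" rule are all assumptions made for illustration.

```python
import math
import random

def norm_cdf(z):
    """Standard normal CDF, the probit link's inverse."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def linear_predictor(beta, x):
    """Intercept plus dot product, clipped to keep the CDF away from 0 and 1."""
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return max(-6.0, min(6.0, eta))

def fit_probit(X, y, lr=0.1, n_iter=300):
    """Fit probit regression by plain gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            eta = linear_predictor(beta, xi)
            cdf = norm_cdf(eta)
            # Per-observation score: phi(eta) * (y - Phi(eta)) / (Phi * (1 - Phi))
            w = norm_pdf(eta) * (yi - cdf) / (cdf * (1.0 - cdf))
            grad[0] += w
            for j, v in enumerate(xi):
                grad[j + 1] += w * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def adversarial_select(X_train, X_test, n_select):
    """Return (AUC, indices of the n_select most test-like training rows, train scores)."""
    X = X_train + X_test
    y = [0] * len(X_train) + [1] * len(X_test)   # 0 = training, 1 = testing
    beta = fit_probit(X, y)
    scores = [norm_cdf(linear_predictor(beta, x)) for x in X]
    train_scores = scores[:len(X_train)]
    ranked = sorted(range(len(X_train)), key=lambda i: train_scores[i], reverse=True)
    return auc(scores, y), ranked[:n_select], train_scores

# Synthetic illustration: the testing set is shifted relative to the training set.
random.seed(0)
X_train = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(60)]
X_test = [[random.gauss(1, 1), random.gauss(1, 1)] for _ in range(30)]

mismatch_auc, selected, train_scores = adversarial_select(X_train, X_test, n_select=20)
print(f"adversarial AUC: {mismatch_auc:.2f}")
print(f"selected {len(selected)} of {len(X_train)} training lines")
```

In this sketch the AUC is computed on the same rows used to fit the classifier, which tends to overstate the mismatch; a cross-validated AUC would be a more cautious diagnostic. A genomic prediction model would then be trained on the selected lines only and compared against the whole reference population and random subsets of the same size, as in the benchmarks the abstract describes.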


