Article

A General-Purpose Machine Learning R Library for Sparse Kernels Methods With an Application for Genome-Based Prediction

Journal

FRONTIERS IN GENETICS
Volume 13

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fgene.2022.887643

Keywords

R package; machine learning; kernel; supervised learning; sparse kernels; genome-based prediction

Funding

  1. Bill & Melinda Gates Foundation [9 MTO 069033]
  2. USAID [INV-003439]
  3. AGG-Maize Supplementary Project
  4. AGG (Stress Tolerant Maize for Africa)
  5. CIMMYT CRP (maize and wheat)

This paper presents a new software package (SKM) for implementing six popular supervised machine learning algorithms with the optional use of sparse kernels, as well as a function for computing seven different kernels. SKM focuses on user simplicity and computational efficiency, providing a user-friendly format for algorithms and reducing resources needed for kernel machine learning methods.
The adoption of machine learning frameworks in areas beyond computer science has been facilitated by the development of user-friendly software tools that do not require an advanced understanding of computer programming. In this paper, we present a new software package, sparse kernel methods (SKM), developed in the R language for implementing six of the most popular supervised machine learning algorithms (generalized boosted machines, generalized linear models, support vector machines, random forest, Bayesian regression models and deep neural networks) with the optional use of sparse kernels. SKM focuses on user simplicity: it does not try to include all the available machine learning algorithms, but rather the most important aspects of these six algorithms in an easy-to-understand format. Another relevant contribution of this package is a function for the computation of seven different kernels: Linear, Polynomial, Sigmoid, Gaussian, Exponential, Arc-Cosine 1 and Arc-Cosine L (with L = 2, 3, ...), together with their sparse versions, which allow users to create kernel machines without modifying the statistical machine learning algorithm. The main contribution of the package is the functionality for computing the sparse version of these seven basic kernels, which is indispensable for reducing the computational resources needed to implement kernel machine learning methods without a significant loss in prediction performance. The performance of SKM is evaluated in a genome-based prediction framework using both a maize and a wheat data set. However, the use of this package is not restricted to genome prediction problems, and it can be used in many different applications.
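The idea of a sparse kernel can be illustrated with a minimal sketch. The example below is written in Python with NumPy rather than R, and it does not use the SKM API. The function names (`gaussian_kernel`, `nystrom_features`), the scaling of the bandwidth by the number of markers, and the choice of a Nyström-style low-rank approximation are all assumptions made for illustration; the package may construct its sparse kernels differently.

```python
import numpy as np


def gaussian_kernel(X, Y, gamma=1.0):
    """Gaussian kernel between the rows of X and the rows of Y.

    Dividing by the number of columns (markers) is an illustrative
    scaling choice, not something taken from the SKM package.
    """
    sq_dist = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    sq_dist = np.maximum(sq_dist, 0.0)
    return np.exp(-gamma * sq_dist / X.shape[1])


def nystrom_features(X, m, kernel, rng):
    """Return an n x r matrix P (r <= m) such that P @ P.T approximates
    the full n x n kernel, using only m randomly chosen rows of X.

    This is the standard Nystrom construction
    K ~= K_nm @ pinv(K_mm) @ K_nm.T, used here as a stand-in for a
    "sparse kernel" (hypothetical helper, not the SKM implementation).
    """
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)  # the m "landmark" rows
    K_nm = kernel(X, X[idx])        # n x m block
    K_mm = kernel(X[idx], X[idx])   # m x m block
    vals, vecs = np.linalg.eigh(K_mm)
    keep = vals > 1e-10             # drop numerically null directions
    return K_nm @ vecs[:, keep] / np.sqrt(vals[keep])


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.standard_normal((200, 500))  # 200 lines, 500 simulated markers
    K = gaussian_kernel(X, X)            # full 200 x 200 kernel
    P = nystrom_features(X, 40, gaussian_kernel, rng)
    rel_err = np.linalg.norm(K - P @ P.T) / np.linalg.norm(K)
    print(f"P has shape {P.shape}; relative error {rel_err:.3f}")
```

Under these assumptions, any learner that accepts a design matrix can be fitted on `P` (n rows, at most m columns) instead of the full n × n kernel, which is where the savings in memory and computing time would come from. The number of landmark rows `m` trades accuracy against cost.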
