Article

Support vector regression that takes into consideration the importance of explanatory variables

Journal

JOURNAL OF CHEMOMETRICS
Volume 35, Issue 4, Pages -

Publisher

WILEY
DOI: 10.1002/cem.3327

Keywords

predictive performance; QSAR; QSPR; support vector regression; variable importance

Funding

  1. JSPS KAKENHI [JP19K15352]

Support vector regression (SVR) can capture the nonlinear relationship between explanatory variables X and a target variable y, leading to high predictive accuracy. The novel VI-SVR method, which considers variable importance, outperforms traditional SVR in predictive accuracy.
Support vector regression (SVR) can model the nonlinear relationship between explanatory variables X and a target variable y, yielding regression models with high predictive accuracy. In addition, the y values that an SVR model predicts for new samples can exceed the y values observed in the training data. However, the Gaussian kernel, the kernel function generally used in SVR, is based on the Euclidean distance between samples, so it cannot take the importance of each X variable into account when the regression model is built. This study therefore focused on the importance of X that can be calculated by random forests (RF) and proposed a novel SVR method based on this importance, called variable importance-considering support vector regression (VI-SVR). Because X is weighted by importance, the more important an X variable is, the more it contributes to the predicted value. Analyses of spectral, quantitative structure-property relationship (QSPR), and quantitative structure-activity relationship (QSAR) datasets confirmed that the predictive accuracy of VI-SVR was better than that of SVR. VI-SVR Python code is available at .
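
The idea of weighting X inside a distance-based kernel can be illustrated with a minimal sketch. It is not the authors' VI-SVR code, and the abstract does not state the exact weighting scheme, so the sketch simply assumes that each variable is multiplied by its importance before the Gaussian kernel is evaluated. The function names and the importance values are hypothetical; in the paper, the importances come from a random forest.

```python
import math


def gaussian_kernel(a, b, gamma=1.0):
    """Standard Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def weighted_gaussian_kernel(a, b, weights, gamma=1.0):
    """Gaussian kernel evaluated after scaling each variable by its weight.

    Assumption: plain multiplication by the weight. The weighting used in
    the paper may differ.
    """
    a_w = [w * ai for w, ai in zip(weights, a)]
    b_w = [w * bi for w, bi in zip(weights, b)]
    return gaussian_kernel(a_w, b_w, gamma)


# Hypothetical importances for two explanatory variables
# (illustrative stand-ins for values a random forest might report).
importances = [0.9, 0.1]

x_ref = [0.0, 0.0]
x_diff_important = [1.0, 0.0]    # differs from x_ref only in the important variable
x_diff_unimportant = [0.0, 1.0]  # differs from x_ref only in the unimportant variable

# Unweighted kernel: both samples are equally similar to x_ref.
k_plain_1 = gaussian_kernel(x_ref, x_diff_important)
k_plain_2 = gaussian_kernel(x_ref, x_diff_unimportant)

# Weighted kernel: a difference in the important variable lowers the
# similarity far more than the same difference in the unimportant one.
k_w_1 = weighted_gaussian_kernel(x_ref, x_diff_important, importances)
k_w_2 = weighted_gaussian_kernel(x_ref, x_diff_unimportant, importances)

print(k_plain_1, k_plain_2)
print(k_w_1, k_w_2)
```

With the unweighted kernel, the two comparison samples are indistinguishable in their similarity to the reference sample. Once the variables are scaled, the sample that differs in the important variable is treated as much less similar, which is how importance can influence the predictions of a kernel-based model.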
