Journal
JOURNAL OF CHEMOMETRICS
Volume 35, Issue 4, Pages -
Publisher
WILEY
DOI: 10.1002/cem.3327
Keywords
predictive performance; QSAR; QSPR; support vector regression; variable importance
Funding
- JSPS KAKENHI [JP19K15352]
Support vector regression (SVR) can capture the nonlinear relationship between explanatory variables X and a target variable y, leading to high predictive accuracy. The novel VI-SVR method, which considers variable importance, outperforms traditional SVR in predictive accuracy.
Support vector regression (SVR) is able to consider the nonlinear relationship between explanatory variables X and a target variable y to build a regression model with high predictive accuracy. Additionally, y values predicted with SVR models for new samples can exceed the actual y values in training data. However, because the Gaussian kernel, which is a kernel function generally used in SVR, is based on the Euclidean distance between samples, it is unable to consider the importance of X when building the regression model. Therefore, in this study, the focus was on the importance of X that can be calculated by random forests (RF), and a novel SVR method, called variable importance-considering support vector regression (VI-SVR), was proposed based on this importance. Because X is weighted based on importance, the greater the importance of X, the greater its contribution to the predicted value. Analysis using the spectral, quantitative structure-property relationship (QSPR), and quantitative structure-activity relationship (QSAR) datasets confirmed that the predictive accuracy of VI-SVR was better than that of SVR. VI-SVR Python code is available at .
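The abstract's central point, that a Gaussian kernel built on plain Euclidean distance treats every explanatory variable alike while weighting X by importance changes each variable's influence, can be illustrated with a small sketch. This is not the authors' VI-SVR implementation: the function names, the importance values, and the choice to multiply each variable by its importance before computing the distance are assumptions made for illustration only. In practice the importance scores would come from a fitted random forest, as the abstract describes.

```python
import math


def gaussian_kernel(a, b, gamma=1.0):
    """Standard Gaussian kernel: depends only on the Euclidean distance."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def weighted_gaussian_kernel(a, b, weights, gamma=1.0):
    """Gaussian kernel computed after scaling each variable by its weight.

    Assumption: the weighting is a simple per-variable multiplication.
    """
    sq_dist = sum((w * (ai - bi)) ** 2 for w, ai, bi in zip(weights, a, b))
    return math.exp(-gamma * sq_dist)


# Hypothetical importance scores for two explanatory variables
# (illustrative values, not taken from the paper).
importances = [0.9, 0.1]

base = [0.0, 0.0]
differs_in_important = [1.0, 0.0]    # differs only in the important variable
differs_in_unimportant = [0.0, 1.0]  # differs only in the unimportant variable

# The plain kernel cannot tell the two cases apart.
k_plain_1 = gaussian_kernel(base, differs_in_important)
k_plain_2 = gaussian_kernel(base, differs_in_unimportant)

# The weighted kernel treats a difference in the important variable
# as a larger dissimilarity than the same difference in the other one.
k_weighted_1 = weighted_gaussian_kernel(base, differs_in_important, importances)
k_weighted_2 = weighted_gaussian_kernel(base, differs_in_unimportant, importances)

print(k_plain_1 == k_plain_2)
print(k_weighted_1 < k_weighted_2)
```

With the plain kernel both comparison samples are equally similar to the base sample. With the weighted kernel the sample that differs in the important variable is less similar, so that variable has more influence on any kernel-based prediction.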