Article

Affinity and class probability-based fuzzy support vector machine for imbalanced data sets

Journal

NEURAL NETWORKS
Volume 122, Pages 289-307

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2019.10.016

Keywords

Imbalanced data; Fuzzy support vector machine (FSVM); Affinity; Class probability; Kernel k-nearest neighbor

Funding

  1. Fundamental Research Funds for the Central Universities [2572017EB02, 2572017CB07]
  2. Innovative Talent Fund of the Harbin Science and Technology Bureau [2017RAXXJ018]
  3. Double First-Class Scientific Research Foundation of Northeast Forestry University [411112438]

Abstract

Learning from imbalanced data sets poses a major challenge in the data mining community. Although the conventional support vector machine (SVM) generally shows relatively robust performance on imbalanced classification problems, it treats all training samples as equally important for learning, which biases the final decision boundary in favor of the majority class, especially in the presence of outliers or noise. In this paper, we propose a new affinity and class probability-based fuzzy support vector machine technique (ACFSVM). The affinity of a majority class sample is computed from a support vector data description (SVDD) model trained only on the majority class training samples, in the same kernel space as that used for FSVM learning. The resulting affinity identifies possible outliers and border samples within the majority class. To eliminate the effect of noise, we further employ the kernel k-nearest neighbor method to determine the class probability of each majority class sample in the same kernel space. Samples with lower class probabilities are more likely to be noise, and their contribution to learning is reduced through the low memberships obtained by combining their affinities and class probabilities. ACFSVM thus pays more attention to majority class samples with higher affinities and class probabilities while suppressing the effect of those with lower values, eventually pushing the final classification boundary back toward the majority class. In addition, the minority class samples are assigned relatively high memberships to guarantee their importance during model learning. Extensive experimental results on different imbalanced data sets from the UCI repository demonstrate that the proposed approach achieves better generalization performance in terms of G-Mean, F-Measure, and AUC than other existing imbalanced data set classification techniques. (c) 2019 Elsevier Ltd. All rights reserved.
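
The sketch below illustrates the pipeline the abstract describes. It is a minimal approximation under stated assumptions, not the authors' exact formulation: it assumes scikit-learn, substitutes OneClassSVM for the SVDD step (with an RBF kernel the two one-class models are closely related), combines affinity and class probability by a simple product rather than the paper's specific membership functions, and realizes fuzzy memberships through SVC's per-sample weights. The function name acfsvm_fit and all parameter values are illustrative.

    # Minimal ACFSVM-style sketch (assumed approximation, not the paper's exact formulas).
    import numpy as np
    from sklearn.svm import OneClassSVM, SVC
    from sklearn.metrics.pairwise import rbf_kernel

    def acfsvm_fit(X_maj, X_min, gamma=0.5, k=5, C=1.0):
        """Fit a fuzzy-weighted SVM; X_maj / X_min hold majority / minority samples."""
        # 1) SVDD-style affinity: a one-class model trained on the majority class
        #    only; decision_function grows for samples deeper inside the learned
        #    domain, so outliers and border samples receive low affinity.
        svdd = OneClassSVM(kernel="rbf", gamma=gamma, nu=0.1).fit(X_maj)
        d = svdd.decision_function(X_maj)
        affinity = (d - d.min()) / (d.max() - d.min() + 1e-12)  # scale to [0, 1]

        # 2) Kernel k-NN class probability in the same RBF kernel space, using
        #    ||phi(x) - phi(y)||^2 = 2 - 2*k(x, y) for the normalized RBF kernel.
        X_all = np.vstack([X_maj, X_min])
        y_all = np.hstack([np.ones(len(X_maj)), -np.ones(len(X_min))])
        dist2 = 2.0 - 2.0 * rbf_kernel(X_maj, X_all, gamma=gamma)
        nn = np.argsort(dist2, axis=1)[:, 1:k + 1]   # drop each sample's self-match
        prob = (y_all[nn] == 1).mean(axis=1)         # fraction of majority neighbors

        # 3) Fuzzy memberships: low affinity or low class probability shrinks a
        #    majority sample's weight; minority samples keep full importance.
        weights = np.hstack([affinity * prob, np.ones(len(X_min))])

        # 4) sample_weight rescales each sample's slack penalty C_i, which is how
        #    fuzzy memberships enter the standard soft-margin SVM objective.
        return SVC(kernel="rbf", gamma=gamma, C=C).fit(X_all, y_all, sample_weight=weights)

A fitted model is then used like any scikit-learn classifier, e.g. clf.predict(X_test). In practice gamma, nu, k, and C would be tuned, with imbalance-aware metrics such as G-Mean or AUC driving model selection, as in the paper's experiments.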
