Article

Adaptively weighted three-way decision oversampling: A cluster imbalanced-ratio based approach

Journal

APPLIED INTELLIGENCE
Volume 53, Issue 1, Pages 312-335

Publisher

SPRINGER
DOI: 10.1007/s10489-022-03394-7

Keywords

Machine learning; Imbalanced learning; Binary classification; Oversampling

This paper proposes an improved oversampling method for imbalanced learning, called adaptively weighted three-way decision oversampling (AWTDO). The method removes noise samples, clusters the data, categorizes the clusters by their imbalance ratios, and generates synthetic samples for each cluster according to its category. Experiments on public datasets show that AWTDO outperforms nine state-of-the-art oversampling methods.
Oversampling is an effective approach to imbalanced learning because it restores class balance simply by synthesizing new samples. However, synthesizing samples precisely remains a significant challenge, primarily because of noise samples, within-class imbalance, and the selection of boundary samples. To address these problems, this paper proposes an improved oversampling method for imbalanced learning, called adaptively weighted three-way decision oversampling (AWTDO). The method works in three main steps. First, noise samples are roughly removed, the K-means clustering algorithm is applied to the raw data to form multiple clusters, and the imbalance ratio of each cluster is calculated. Second, all clusters are classified into three categories (positive domain, boundary domain, and negative domain) according to their imbalance ratios and three-way decision, and the number of synthetic samples assigned to each cluster is differentiated by its category. Third, the target minority samples in each cluster are selected deterministically, and new synthetic samples are generated by stochastic linear interpolation according to different sampling weights. Comparative experiments on public datasets show that the proposed AWTDO method outperforms nine state-of-the-art oversampling methods.
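
The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the thresholds, the weights, the noise filter, or the rule for selecting target samples, so everything specific below is an assumption. In particular, the imbalance ratio is taken as majority count over minority count within a cluster, the thresholds `alpha` and `beta`, the region weights, and the helper names `kmeans`, `three_way_region`, and `oversample` are hypothetical, the noise-removal step is omitted, and target samples are drawn uniformly at random as a stand-in for the paper's selection rule.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def three_way_region(n_min, n_maj, alpha=0.5, beta=2.0):
    """Map a cluster to a region from its imbalance ratio (assumed: majority / minority).

    alpha and beta are illustrative thresholds, not values from the paper.
    """
    if n_min == 0:
        return "negative"  # nothing to interpolate from
    ratio = n_maj / n_min
    if ratio <= alpha:
        return "positive"
    if ratio <= beta:
        return "boundary"
    return "negative"


def oversample(X, y, k=3, minority=1, seed=0):
    """Cluster, categorize clusters, then interpolate inside each cluster."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    labels = kmeans(X, k, seed=seed)
    is_min = y == minority
    n_new = int((~is_min).sum() - is_min.sum())  # samples needed for balance

    # Assumed weighting: positive clusters get full weight, boundary half, negative none.
    region_weight = {"positive": 1.0, "boundary": 0.5, "negative": 0.0}
    scores = np.zeros(k)
    for j in range(k):
        in_cluster = labels == j
        n_min = int((in_cluster & is_min).sum())
        n_maj = int((in_cluster & ~is_min).sum())
        scores[j] = region_weight[three_way_region(n_min, n_maj)] * n_min
    if n_new <= 0 or scores.sum() == 0:
        return X, y

    # Share the synthetic-sample budget across clusters in proportion to their score.
    counts = np.floor(n_new * scores / scores.sum()).astype(int)
    synthetic = []
    for j in range(k):
        pool = X[(labels == j) & is_min]
        for _ in range(counts[j]):
            a, b = pool[rng.integers(len(pool), size=2)]
            synthetic.append(a + rng.random() * (b - a))  # stochastic linear interpolation
    if not synthetic:
        return X, y
    X_out = np.vstack([X, np.array(synthetic)])
    y_out = np.concatenate([y, np.full(len(synthetic), minority)])
    return X_out, y_out
```

Calling `oversample(X, y, k=3, minority=1)` returns the original rows followed by the synthetic minority rows. Because each synthetic point is a convex combination of two minority samples from the same cluster, it stays inside that cluster's minority region, which is the property the cluster-wise design is meant to secure.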
