Article

A Highly Adaptive Oversampling Approach to Address the Issue of Data Imbalance

Journal

COMPUTERS
Volume 11, Issue 5

Publisher

MDPI
DOI: 10.3390/computers11050073

Keywords

imbalanced learning; oversampling; optimized oversampling; adaptive sampling

Funding

  1. European Union
  2. European Social Fund
  3. [EFOP-3.6.3-VEKOP-16-2017-00002]

This paper proposes a new oversampling method that can be optimized to adapt to different data sets. Experimental results show that its effect on classifier performance is similar to or better than that of other well-known samplers across several classifiers.
Data imbalance is a serious problem in machine learning that can be alleviated at the data level by balancing the class distribution with sampling. Over the last decade, several sampling methods have been published to address shortcomings of the earliest ones, such as noise sensitivity and incorrect neighbor selection. Our review of the literature shows that these algorithms perform differently on different data sets. In this paper, we present a new oversampler built from the key steps and sampling strategies identified by analyzing dozens of existing methods; it can be fitted to a given data set through an optimization process. Experiments on a number of data sets show that the proposed method has a similar or better effect on the performance of SVM, DTree, kNN and MLP classifiers than other well-known samplers from the literature. Statistical tests confirm the results.
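
To make "balancing the class distribution with sampling" concrete, the following is a minimal sketch of generic SMOTE-style interpolation, the kind of baseline oversampling that such work builds on. It is not the method proposed in the paper, whose steps and optimization procedure are not described in the abstract. The function name `interpolate_oversample` and its parameters `n_new`, `k` and `seed` are hypothetical names chosen for illustration.

```python
import numpy as np

def interpolate_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Illustrative baseline only (an assumption), not the paper's oversampler.
    """
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # cannot have more neighbors than other samples
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest minority neighbors of each sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbors for each new point.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Place each new point at a random position on the connecting segment.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])

# Usage: four minority samples on the corners of the unit square.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = interpolate_oversample(X_min, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, this baseline is sensitive to noisy samples and poorly chosen neighbors, which are the shortcomings the abstract attributes to the earliest samplers.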
