4.7 Article

A GAN-based hybrid sampling method for imbalanced customer classification

Journal

INFORMATION SCIENCES
Volume 609, Pages 1397-1411

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2022.07.145

Keywords

Customer classification; Class imbalance; Class overlap; GAN-based sampling

Funding

  1. Major Project of the National Social Science Foundation of China [18VZL006]
  2. Funds of Sichuan University [skqy 201742, 2019ZY-Business04]

Class imbalance is a critical issue in customer classification, and various techniques have been proposed to address this problem. This study introduces a novel GAN-based hybrid sampling method that effectively tackles class imbalance and demonstrates superior performance in experiments.
Class imbalance is a critical issue in customer classification, for which a plethora of techniques have been proposed in the literature. In particular, generative adversarial network (GAN)-based oversampling can capture the true data distribution of minority class samples and generate new samples, and this approach has demonstrated an outstanding ability to address class imbalance. However, GAN-based oversampling suffers from class overlap. To address this, we propose a novel GAN-based hybrid sampling method. The new approach first uses GAN-based oversampling to generate an initial balanced dataset and then applies a novel adaptive neighborhood-based weighted undersampling method to remove generated instances and original majority class instances. This approach not only produces instances that fit the actual data distribution but also significantly reduces the influence of class overlap. Experimental results on artificial data and real-world customer datasets show that the proposed GAN-based hybrid sampling method outperforms other benchmark methods on both accuracy-based and profit-based evaluation metrics. (c) 2022 Elsevier Inc. All rights reserved.
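
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. Two simplifications are assumed here: a Gaussian fitted to the minority class stands in for the trained GAN generator, and a fixed-k, unweighted nearest-neighbor overlap score stands in for the adaptive neighborhood-based weighting, whose details the abstract does not give. The function names, `k`, and `threshold` are hypothetical.

```python
import numpy as np

def oversample_minority(X_min, n_new, rng):
    # ASSUMPTION: a fitted Gaussian replaces the GAN generator for brevity.
    mean = X_min.mean(axis=0)
    cov = np.cov(X_min, rowvar=False) + 1e-6 * np.eye(X_min.shape[1])
    return rng.multivariate_normal(mean, cov, size=n_new)

def overlap_scores(X, y, k=5):
    # Fraction of each point's k nearest neighbors that carry the other label.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)          # a point is not its own neighbor
    nn = np.argsort(dist, axis=1)[:, :k]
    return (y[nn] != y[:, None]).mean(axis=1)

def hybrid_sample(X, y, k=5, threshold=0.5, seed=0):
    # Labels: 0 = majority class, 1 = minority class.
    rng = np.random.default_rng(seed)
    X_maj, X_min = X[y == 0], X[y == 1]

    # Stage 1: oversample the minority class until the classes are balanced.
    n_new = len(X_maj) - len(X_min)
    X_gen = oversample_minority(X_min, n_new, rng)
    X_bal = np.vstack([X, X_gen])
    y_bal = np.concatenate([y, np.ones(n_new, dtype=y.dtype)])

    # Stage 2: only generated instances and original majority instances may
    # be removed; original minority instances are always kept.
    removable = np.concatenate([y == 0, np.ones(n_new, dtype=bool)])
    # ASSUMPTION: fixed k and a hard threshold, not the paper's adaptive weights.
    scores = overlap_scores(X_bal, y_bal, k)
    keep = ~(removable & (scores > threshold))
    return X_bal[keep], y_bal[keep]
```

Calling `hybrid_sample(X, y)` on a feature matrix `X` with binary labels `y` returns a resampled dataset in which every original minority instance is retained and instances lying in heavily overlapped regions have been dropped. In practice the Gaussian stand-in would be replaced by a generator trained on the minority class.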
