Journal
IEEE ACCESS
Volume 10, Issue -, Pages 116120-116128
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3219582
Keywords
Classification algorithms; Clustering algorithms; Probability; Ensemble learning; Partitioning algorithms; Machine learning algorithms; Decision trees; Imbalanced data; density peaks clustering; fitness; under-sampling; classification
Funding
- National Natural Science Foundation of China [62172351]
This paper proposes an algorithm based on density peaks clustering and fitness to address the low classification accuracy of the minority class in imbalanced data. Experimental results show that the algorithm outperforms other algorithms.
To address the low classification accuracy of the minority class in imbalanced data, this paper proposes DPF-EL (density peaks and fitness combined with ensemble learning), an algorithm based on density peaks clustering and fitness. First, the method uses the density peaks clustering algorithm to divide the majority class into sub-clusters. The local density calculated during clustering is used to assign a weight to each sub-cluster, and the weights determine how many samples are drawn from each sub-cluster during under-sampling. Second, the concept of fitness is introduced within the sub-clusters: each sample's selection probability is calculated from its fitness, and the majority class is under-sampled according to these probabilities. Finally, a boosting algorithm is trained iteratively on the balanced data set. Experiments on KEEL imbalanced data sets show that DPF-EL performs better than the other algorithms compared, which indicates the feasibility of the proposed algorithm.
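The under-sampling steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the weight formula or the definition of fitness, so the sketch assumes that a sub-cluster's weight is its share of the total local density and that a sample's fitness is simply its local density. The function names `density_peaks` and `dpf_undersample`, the Gaussian kernel, and the 2nd-percentile cutoff distance are likewise assumptions made for illustration.

```python
import numpy as np

def density_peaks(X, n_clusters, dc=None):
    """Minimal density peaks clustering. Returns (labels, rho)."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    if dc is None:
        # Assumption: cutoff distance = 2nd percentile of pairwise distances.
        dc = np.percentile(dist[np.triu_indices(n, k=1)], 2)
    dc = max(dc, 1e-12)
    # Gaussian-kernel local density (self-contribution of 1 removed).
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)              # indices by decreasing density
    delta = np.zeros(n)                   # distance to nearest denser point
    nearest_higher = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = dist[i].max()      # densest point gets the largest delta
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j
    # Cluster centers: points with the largest rho * delta.
    centers = np.argsort(-(rho * delta))[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Remaining points inherit the label of their nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            j = nearest_higher[i]
            labels[i] = labels[j] if j >= 0 else 0
    return labels, rho

def dpf_undersample(X_maj, n_target, n_clusters=3, seed=None):
    """Return indices of majority-class samples kept after under-sampling."""
    rng = np.random.default_rng(seed)
    labels, rho = density_peaks(X_maj, n_clusters)
    rho = rho + 1e-12                     # keep all probabilities strictly positive
    # Assumption: sub-cluster weight = its share of the total local density.
    cluster_rho = np.array([rho[labels == c].sum() for c in range(n_clusters)])
    weights = cluster_rho / cluster_rho.sum()
    chosen = []
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        k = min(len(idx), int(round(weights[c] * n_target)))
        if k == 0:
            continue
        # Assumption: fitness = local density; selection probability is
        # proportional to fitness (roulette-wheel style, without replacement).
        p = rho[idx] / rho[idx].sum()
        chosen.append(rng.choice(idx, size=k, replace=False, p=p))
    if not chosen:
        return np.array([], dtype=int)
    return np.sort(np.concatenate(chosen))
```

Calling `dpf_undersample(X_maj, n_target=len(X_min))` would yield indices of a majority-class subset of roughly the minority-class size. The final step, iterative training with boosting on the balanced set, is not shown; one option would be to pass the balanced data to an off-the-shelf boosting classifier, although the abstract does not say which boosting algorithm the paper uses.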