Journal
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 33, Issue 5, Pages 1988-2001
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2019.2951556
Keywords
Boosting; Bagging; Training; Machine learning algorithms; Measurement; Standards; Sampling methods; Class imbalance learning; oversampling; ensemble learning; missing data imputation
Funding
- Natural Sciences and Engineering Research Council of Canada (NSERC) [401226689]
The correct classification of rare samples is crucial, and this article proposes novel oversampling strategies based on imputation methods to address this issue. The techniques generate synthetic minority class samples and outperform other methods on performance metrics such as AUC, F-measure, and G-mean.
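The three metrics named above are standard for imbalanced evaluation: F-measure balances precision and recall on the minority class, while G-mean is the geometric mean of the per-class accuracies, penalizing a classifier that sacrifices the rare class. A minimal sketch of both, computed from confusion-matrix counts (the function name and example counts are illustrative, not from the paper):

```python
import math

def imbalance_metrics(tp, fp, tn, fn):
    """Compute F-measure and G-mean from binary confusion counts,
    treating the minority class as the positive class."""
    recall = tp / (tp + fn)        # minority-class sensitivity
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)   # majority-class accuracy
    f_measure = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)
    return f_measure, g_mean

# toy confusion counts for an imbalanced test set
f1, gm = imbalance_metrics(tp=40, fp=10, tn=90, fn=10)
```

A high accuracy alone would hide minority-class failure here; G-mean drops toward zero if either class is misclassified wholesale, which is why it is favored in class-imbalance studies.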
Correct classification of rare samples is a vital data mining task and of paramount importance in many research domains. This article focuses on the development of novel class-imbalance learning techniques that integrate oversampling methods with bagging and boosting ensembles. Two novel oversampling strategies, based on single and multiple imputation methods, are proposed. The proposed techniques aim to create useful synthetic minority class samples, similar to the original minority class samples, by estimating missing values that are artificially induced in the minority class samples. The re-balanced datasets are then used to train the base learners of the ensemble algorithms. In addition, the proposed techniques are compared with commonly used class-imbalance learning methods in terms of three performance metrics, AUC, F-measure, and G-mean, over several synthetic binary-class datasets. The empirical results show that the proposed multiple imputation-based oversampling combined with bagging significantly outperforms the other competitors.
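The core idea in the abstract, copying real minority samples, deliberately deleting some of their feature values, and then filling the gaps with an imputation model to obtain new-but-plausible samples, can be sketched as follows. This is a simplified illustration, assuming column-mean imputation as the single-imputation model; the function name, masking rate, and NumPy-only setup are my assumptions, not the paper's implementation:

```python
import numpy as np

def imputation_oversample(X_min, n_new, mask_frac=0.3, seed=0):
    """Sketch of imputation-based oversampling (assumed mean imputation):
    1) sample minority rows with replacement,
    2) induce missing values at random positions,
    3) fill the holes with the column means of the real minority data."""
    rng = np.random.default_rng(seed)
    col_means = X_min.mean(axis=0)                 # imputation model
    idx = rng.integers(0, len(X_min), n_new)       # bootstrap minority rows
    X_new = X_min[idx].astype(float).copy()
    mask = rng.random(X_new.shape) < mask_frac     # induced "missingness"
    X_new[mask] = np.broadcast_to(col_means, X_new.shape)[mask]
    return X_new

# usage: generate 5 synthetic samples from 3 real minority samples
X_min = np.array([[1.0, 2.0], [1.2, 1.8], [0.9, 2.2]])
X_syn = imputation_oversample(X_min, n_new=5, mask_frac=0.5)
```

In the paper's full pipeline each base learner of a bagging or boosting ensemble would be trained on a dataset re-balanced this way; multiple imputation would replace the single mean-fill step with several draws from an imputation model.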