4.7 Article

Ensemble learning-based filter-centric hybrid feature selection framework for high-dimensional imbalanced data

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 220, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.106901

Keywords

Hybrid feature selection; Ensemble feature selection; Multiple classifiers; Robust feature subset; High-dimensional imbalanced data

Funding

  1. Basic Science Research Program through the National Research Foundation of Korea (NRF) - Ministry of Education, Science and Technology, South Korea [NRF-2016 R1D1A1B03932110]

Research on feature selection for high-dimensional imbalanced data has been a focus of attention. A hybrid method that combines filter and ensemble learning is proposed to select the best feature subset.
In recent years, research on feature selection for high-dimensional imbalanced data has attracted a considerable amount of attention. The filter-wrapper hybrid method, which is a conventional method of feature selection for high-dimensional data, aims to reduce excessive computational time. On the other hand, ensemble learning-based feature selection, even though it has a high level of computational complexity, focuses exclusively on the discovery of robust features. From this perspective, combining these two feature selection methods is not easy. However, a combined method is essential to advancing machine learning research that addresses real-world problems. We propose a filter-centric hybrid method based on ensemble learning that can select the best feature subset for high-dimensional imbalanced data. The basic concept of the proposed method is to design a feature evaluation scheme based on the filter method and to apply ensemble learning with reasonable computational time. To achieve this objective, our method uses the predictions produced by multiple classifiers as inputs to the feature evaluation function. As a result, it can reflect the predictive performance of the classifiers and overcome the low performance of features selected by filter methods. In addition, it can find robust features at the same time. To demonstrate the superiority of the proposed method, we perform various experiments using 14 experimental datasets that consist of low-dimensional balanced, high-dimensional balanced, and high-dimensional imbalanced datasets. Finally, we compare the proposed method with state-of-the-art feature selection methods. (c) 2021 Elsevier B.V. All rights reserved.
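
The abstract does not give the feature evaluation function itself, so the following is only a rough sketch of the general idea of feeding predictions from several classifiers into a filter-style score. It assumes, purely for illustration, that a feature is scored by its mean absolute Pearson correlation with each classifier's prediction vector. The names `pearson`, `score_features`, and `select_top_k`, the scoring criterion, and the toy data are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def score_features(X, predictions):
    """Filter-style score per feature (assumed criterion, not the paper's).

    X is a list of rows; predictions is a list of prediction vectors,
    one per classifier. Each feature is scored by its mean absolute
    correlation with the classifiers' predictions.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        total = sum(abs(pearson(column, p)) for p in predictions)
        scores.append(total / len(predictions))
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy data: 4 samples, 3 features (the third feature is constant).
X = [
    [0.1, 5.0, 1.0],
    [0.2, 3.0, 1.0],
    [0.9, 4.0, 1.0],
    [0.8, 6.0, 1.0],
]
# Hypothetical class predictions from three different classifiers.
predictions = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]

scores = score_features(X, predictions)
selected = select_top_k(scores, 1)
print(scores, selected)
```

In this sketch the classifiers are trained once and only their prediction vectors are reused, so scoring stays as cheap as an ordinary filter pass over the features. That is one plausible way to read the "reasonable computational time" goal, not a statement of how the authors achieve it.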
