4.7 Article

Ensemble feature selection in high dimension, low sample size datasets: Parallel and serial combination approaches

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 203, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2020.106097

Keywords

Data mining; Ensemble learning; Feature selection; High dimension low sample size; Machine learning

Feature selection in high dimension, low sample size (HDLSS) data is always an important data pre-processing task. In the literature, the concept of ensemble learning has been applied to improve single feature selection methods, the so-called ensemble feature selection techniques. The most widely used approach is to combine multiple feature selection methods and their selection results via some sort of aggregation function in a parallel manner. Another ensemble strategy is based on the serial combination approach where the selection results of the first feature selection stage are used as input for the second stage of feature selection to produce the final output. The aim of this paper is to fully explore the performance of parallel and serial combination approaches for ensemble feature selection over HDLSS data. In particular, we strive to answer two research questions: whether parallel and serial based ensemble feature selection can outperform single feature selection and which combination approach is the better choice for ensemble feature selection. The experimental results based on comparing nine parallel and nine serial combinations, as well as three single baseline feature selection methods, including principal component analysis (PCA), genetic algorithm (GA), and C4.5 decision tree, show that ensemble feature selection performs better than single feature selection in terms of classification accuracy. However, there are no significant differences in performance between the single best baseline method (i.e. GA) and the top three parallel and serial combinations. On the other hand, the serial combination approach produces the largest feature reduction rate. © 2020 Elsevier B.V. All rights reserved.
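
The difference between the parallel and serial combination approaches described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: instead of PCA, GA, and C4.5, it assumes two simple stand-in filters (ranking by variance and ranking by the gap between class means), and all function names, the union/intersection aggregation choices, and the toy data are hypothetical.

```python
from statistics import mean, pvariance

def top_k_by_variance(X, y, features, k):
    """Stand-in selector: keep the k candidate features with the largest variance."""
    scores = {j: pvariance([row[j] for row in X]) for j in features}
    return sorted(features, key=lambda j: scores[j], reverse=True)[:k]

def top_k_by_class_gap(X, y, features, k):
    """Stand-in selector: keep the k candidates whose class means differ most."""
    def gap(j):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(mean(pos) - mean(neg))
    return sorted(features, key=gap, reverse=True)[:k]

def parallel_combination(X, y, selectors, k, aggregate="union"):
    """Run every selector on the full feature set, then aggregate the subsets."""
    all_features = list(range(len(X[0])))
    subsets = [set(select(X, y, all_features, k)) for select in selectors]
    if aggregate == "union":
        combined = set.union(*subsets)
    else:  # assumed alternative aggregation: intersection
        combined = set.intersection(*subsets)
    return sorted(combined)

def serial_combination(X, y, first, second, k_first, k_second):
    """Feed the features kept by the first stage into the second stage."""
    all_features = list(range(len(X[0])))
    survivors = first(X, y, all_features, k_first)
    return sorted(second(X, y, survivors, k_second))

# Hypothetical toy data: 4 samples, 4 features, binary labels.
# Feature 0 separates the classes, feature 1 is noisy with high variance,
# feature 2 is weakly informative with low variance, feature 3 is constant.
X = [
    [0.0, 1.0, 5.0, 10.0],
    [0.2, 9.0, 5.0, 10.0],
    [1.0, 2.0, 5.3, 10.0],
    [0.8, 8.2, 5.3, 10.0],
]
y = [0, 0, 1, 1]

selectors = [top_k_by_variance, top_k_by_class_gap]
print(parallel_combination(X, y, selectors, k=2))                            # [0, 1, 2]
print(parallel_combination(X, y, selectors, k=2, aggregate="intersection"))  # [0]
print(serial_combination(X, y, top_k_by_variance, top_k_by_class_gap, 3, 2)) # [0, 2]
```

In this toy setting the union-based parallel combination keeps three features, while the serial combination keeps two because the second stage only sees what the first stage passed on. This matches, in spirit only, the abstract's observation that the serial approach yields the largest feature reduction rate.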
