Article

Ensemble feature selection for single-label text classification: a comprehensive analytical study

Journal

NEURAL COMPUTING & APPLICATIONS
Volume 35, Issue 26, Pages 19235-19251

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s00521-023-08763-y

Keywords

Text classification; Feature selection; Global; Local; Ensemble feature subsets


Text classification is an important problem in the modern era because of the large amount of textual data. Feature selection, which has a large impact on classification accuracy, is one of the most critical steps in text classification. Various feature selection techniques have been proposed in the literature, each with a different feature ordering and selection criterion. This study combines the distinguishing features chosen by different methods to observe which methods succeed or fail when combined. The results show that combinations of feature selection approaches perform better than any single feature selection method alone, although some combinations perform worse than individual methods.
Because of the large volume of textual data now available, text classification is an important problem. Feature selection is one of the most critical steps in text classification because it strongly affects classification accuracy, and many feature selection techniques have been proposed in the literature. Each method assigns a score to every feature according to its own algorithm and ranks the features by that score; classification is then performed on the top-N features. Features that a method considers important receive high scores and are selected, while insignificant features receive low scores and are left out. Because the scoring criteria differ, the feature ranking differs from method to method, and each method selects a different set of distinguishing features. Combining these distinguishing features can yield higher classification performance. In this study, the feature subsets produced by different methods are therefore combined to observe which methods are successful or unsuccessful in combination. The study also examines how many of the selected features differ from one method to another. Classification is performed by combining feature subsets of different sizes drawn from two local and two global feature selection methods. Extensive experiments on three benchmark datasets show that combining feature selection approaches performs better than any single feature selection method used alone. However, some combinations perform worse than individual methods. The result is a comprehensive study in the text classification domain.
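The score, rank, select top-N, and combine pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the four methods it uses, so `document_frequency` (a global-style score) and `class_difference` (a class-specific, local-style score) are hypothetical stand-ins, the toy corpus is invented, and taking the union of the subsets is one plausible way to combine them, not necessarily the authors' procedure.

```python
from collections import Counter

def document_frequency(docs):
    """Global-style stand-in: number of documents containing each term."""
    scores = Counter()
    for tokens, _label in docs:
        scores.update(set(tokens))
    return dict(scores)

def class_difference(docs, positive):
    """Local-style stand-in: |P(term | positive) - P(term | other classes)|."""
    pos = [set(t) for t, y in docs if y == positive]
    neg = [set(t) for t, y in docs if y != positive]
    vocab = set().union(*pos, *neg)
    return {
        term: abs(sum(term in d for d in pos) / len(pos)
                  - sum(term in d for d in neg) / len(neg))
        for term in vocab
    }

def top_n(scores, n):
    """Return the n highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:n]

def ensemble(*subsets):
    """Union of feature subsets, keeping first-seen order."""
    combined = []
    for subset in subsets:
        for term in subset:
            if term not in combined:
                combined.append(term)
    return combined

# Invented toy corpus: (tokens, label) pairs.
docs = [
    (["cheap", "pills", "offer"], "spam"),
    (["cheap", "offer", "now"], "spam"),
    (["meeting", "now", "agenda"], "ham"),
    (["agenda", "offer", "notes"], "ham"),
]

df_top = top_n(document_frequency(docs), 2)
local_top = top_n(class_difference(docs, "spam"), 2)
combined = ensemble(df_top, local_top)
# Features selected by exactly one of the two methods.
differing = set(df_top) ^ set(local_top)

print(df_top, local_top, combined, sorted(differing))
```

Even on this tiny corpus the two scoring functions rank the vocabulary differently, so the combined subset contains a feature that neither top-2 list holds on its own. The size of `differing` is a simple count of how many selected features the two methods do not share.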
