Article

Ensemble feature selection for single-label text classification: a comprehensive analytical study

NEURAL COMPUTING & APPLICATIONS (2023)

Journal

NEURAL COMPUTING & APPLICATIONS

Volume 35, Issue 26, Pages 19235-19251

Publisher

SPRINGER LONDON LTD

DOI: 10.1007/s00521-023-08763-y

Keywords

Text classification; Feature selection; Global; Local; Ensemble feature subsets

Category

Computer Science, Artificial Intelligence

Summary

Because of the large volume of textual data now available, text classification is an important problem. Feature selection strongly affects classification accuracy and is one of the most important steps in text classification studies. The literature proposes various feature selection techniques, each with its own feature ordering and selection criteria. This study combines the distinguishing features selected by different methods, in different orders, to observe which methods succeed or fail when combined. The results show that combining feature selection approaches performs better than any single feature selection method used alone, although some combinations perform worse than individual methods.

Abstract

Because of the large volume of textual data now available, text classification is an important problem. Feature selection is one of the most important steps in text classification studies because it strongly affects classification accuracy. Many feature selection techniques for text classification have been proposed in the literature. Each method assigns every feature a score according to its own algorithm and sorts the features by that score. Classification is then performed on the top-N features. Each method gives high scores to the features its algorithm considers important, and therefore selects them, and gives low scores to the features it considers insignificant, and therefore discards them. Because the algorithms differ, the resulting feature order differs from method to method, and each method selects a different set of distinguishing features. Combining these distinguishing features can yield higher classification performance. In this study, classification is therefore performed by combining the features that each method ranks in a different order, which makes it possible to observe which methods are successful or unsuccessful when combined. The study also examines how many of the features selected by one method differ from those selected by another. Accordingly, classification is carried out by combining feature subsets of different sizes drawn from two local and two global feature selection methods. Numerous experiments on three benchmark datasets show that combining feature selection approaches performs better than any single feature selection method used alone. However, some combinations perform worse than individual methods. The result is a comprehensive study in the text classification domain.
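The combination step described in the abstract can be illustrated with a minimal sketch. It assumes each method has already produced a score per feature, takes each method's top-N features, merges them as a union, and counts how many selected features differ between two methods. The function names, the union rule, and the toy scores are illustrative assumptions and are not taken from the paper, which may combine the subsets differently.

```python
def top_n(scores, n):
    """Return the n highest-scoring feature names, best first (ties broken by name)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


def combine_top_n(score_tables, n):
    """Union of each method's top-n features, keeping first-seen order."""
    combined = []
    seen = set()
    for scores in score_tables:
        for name in top_n(scores, n):
            if name not in seen:
                seen.add(name)
                combined.append(name)
    return combined


def count_differing(scores_a, scores_b, n):
    """Number of features that appear in exactly one of the two top-n lists."""
    return len(set(top_n(scores_a, n)) ^ set(top_n(scores_b, n)))


# Hypothetical scores from two made-up selectors (not results from the paper).
method_a = {"goal": 0.9, "match": 0.7, "vote": 0.4, "team": 0.2}
method_b = {"vote": 0.8, "party": 0.6, "goal": 0.5, "team": 0.1}

subset = combine_top_n([method_a, method_b], n=2)
differing = count_differing(method_a, method_b, n=2)
```

In this toy case the two methods agree on none of their top two features, so the combined subset holds four features. A classifier would then be trained only on the columns named in `subset`.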
