3.8 Proceedings Paper

Particle Swarm Optimization based Two-Stage Feature Selection in Text Mining

Journal

Publisher

IEEE
DOI: 10.1109/CEC.2018.8477773

Keywords

Feature selection; text mining; particle swarm optimization; two-stage method

Ask authors/readers for more resources

Text mining is an important and popular data mining topic, where a fundamental objective is to enable users to extract informative data from text-based assets and perform related operations on the text, like retrieval, classification, and summarization. For text classification, one of the most important steps is feature selection, because not all the features in the text dataset are useful for classification. Irrelevant and redundant features should be removed to increase the accuracy and decrease the complexity and running time, but it is often an expensive process, and most existing methods using a simple filter to remove features, which might potentially loose some useful ones because of feature interactions. Furthermore, there is little research using particle swarm optimization ( PSO) algorithms to select informative features for text classification. This paper presents an approach using a novel two-stage method for text feature selection, where with the features selected by four different filter ranking methods at the first stage, more irrelevant features are removed by PSO to compose the final feature subset. The proposed algorithm is compared with four traditional feature selection methods on the commonly used Reuter-21578 dataset. The experimental results show that the proposed two-stage method can substantially reduce the dimensionality of the feature space and improve the classification accuracy.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

3.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available