Article

Ant colony optimization for text feature selection in sentiment analysis

Journal

INTELLIGENT DATA ANALYSIS
Volume 23, Issue 1, Pages 133-158

Publisher

IOS PRESS
DOI: 10.3233/IDA-173740

Keywords

Sentiment analysis; metaheuristic algorithm; ant colony optimization; k-nearest neighbour; text feature selection

Funding

  1. Universiti Pertahanan Nasional Malaysia
  2. Ministry of Education Malaysia
  3. Fundamental Research Grant Scheme [FRGS/1/2016/ICT02/UKM/01/2]

In sentiment analysis, the high dimensionality of the feature vector is a key problem because it can reduce the accuracy of sentiment classification and make it difficult to find the optimum subset of features. To address this problem, this study proposes a new text feature selection method based on a wrapper approach, in which ant colony optimization (ACO) guides the feature selection process and the k-nearest neighbour (KNN) classifier is used to generate and evaluate candidate subsets of optimum features. To test the selected subset, dependency relations were used to find the relationship between each feature and the sentiment word in customer reviews. The feature subset produced by the proposed ACO-KNN algorithm was used as input to identify and extract sentiment words from sentences in customer reviews. The resulting relationships between features and sentiment words were evaluated in terms of precision, recall, and F-score. The performance of the proposed ACO-KNN algorithm on customer review datasets was compared with that of two hybrid algorithms from the literature, namely the genetic algorithm with information gain, and information gain with rough set attribute reduction. The experimental results showed that the proposed ACO-KNN algorithm was able to obtain the optimum subset of features and to improve the accuracy of sentiment classification.
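
The wrapper idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' ACO-KNN implementation: it assumes a fixed subset size, a pheromone-only selection rule with no heuristic desirability term, leave-one-out accuracy as the fitness, and synthetic numeric data in place of review text. All function names and parameters (`knn_accuracy`, `aco_select`, `make_toy_data`, `rho`, `n_ants`, and so on) are hypothetical and chosen only for illustration.

```python
import random


def knn_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the chosen features.
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in features), y[j])
            for j, xj in enumerate(X)
            if j != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        correct += predicted == y[i]
    return correct / len(X)


def aco_select(X, y, subset_size=2, n_ants=8, n_iters=15, rho=0.2, seed=0):
    """Simplified ACO wrapper: ants sample subsets, k-NN scores them (assumed design)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    pheromone = [1.0] * n_features
    best_subset, best_score = [], -1.0
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant picks features without replacement, weighted by pheromone.
            candidates = list(range(n_features))
            subset = []
            for _ in range(subset_size):
                weights = [pheromone[f] for f in candidates]
                chosen = rng.choices(candidates, weights=weights)[0]
                subset.append(chosen)
                candidates.remove(chosen)
            score = knn_accuracy(X, y, subset)
            if score > best_score:
                best_subset, best_score = sorted(subset), score
        # Evaporate all trails, then reinforce the best subset found so far.
        pheromone = [(1 - rho) * p for p in pheromone]
        for f in best_subset:
            pheromone[f] += best_score
    return best_subset, best_score


def make_toy_data(n=20, n_noise=4, seed=1):
    """Synthetic data: feature 0 tracks the label, the other features are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([10.0 * label + rng.random()] + [rng.random() for _ in range(n_noise)])
        y.append(label)
    return X, y


X, y = make_toy_data()
subset, score = aco_select(X, y)
print("selected features:", subset, "leave-one-out accuracy:", score)
```

In a text setting, each column of `X` would presumably hold a term weight for one candidate feature, and the selected subset would then be passed on to the sentiment word extraction step; that mapping is an assumption here and is not taken from the paper.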
