Article

An enhanced guided LDA model augmented with BERT based semantic strength for aspect term extraction in sentiment analysis

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 246

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2022.108668

Keywords

Sentiment analysis; Aspect term extraction; Guided LDA; BERT; Semantic similarity


Aspect-level sentiment analysis is a fine-grained task that extracts aspects and their sentiment polarity from opinionated text. This research proposes an unsupervised model that uses minimal aspect seed words to guide the extraction process. The model incorporates guided inputs, multiple pruning strategies, and semantic filters to improve performance. Evaluation results show competitive performance on restaurant domain datasets.
Aspect-level sentiment analysis is a fine-grained task in sentiment analysis. It extracts aspects and their corresponding sentiment polarity from opinionated text. The first subtask, identifying the opinionated aspects, is called aspect extraction and is the focus of this work. Social media platforms are an enormous resource of unlabeled data. However, data annotation for fine-grained tasks is expensive and laborious, so unsupervised models are highly desirable. The proposed model is an unsupervised approach for aspect term extraction: a guided Latent Dirichlet Allocation (LDA) model that uses minimal aspect seed words from each aspect category to guide the model in identifying the hidden topics of interest to the user. The guided LDA model is enhanced by guiding inputs using regular expressions based on linguistic rules. The model is further enhanced by multiple pruning strategies, including a BERT-based semantic filter, which incorporates semantics to strengthen situations where co-occurrence statistics might fail to serve as a differentiator. The thresholds for these semantic filters have been estimated using a Particle Swarm Optimization strategy. The proposed model is expected to overcome the disadvantage of basic LDA models, which fail to differentiate the overlapping topics that represent each aspect category. The work has been evaluated on the restaurant domain of the SemEval 2014, 2015 and 2016 datasets and has reported an F-measure of 0.81, 0.74 and 0.75 respectively, which is competitive in comparison to the state-of-the-art unsupervised baselines and appreciable even with respect to the supervised baselines. © 2022 Elsevier B.V. All rights reserved.
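
The abstract does not spell out how the semantic filter works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each candidate term and seed word has an embedding vector, and that a candidate is kept for an aspect category when its best cosine similarity to any seed word of that category reaches a threshold. The function names, the toy 3-dimensional vectors (stand-ins for BERT embeddings) and the threshold of 0.7 are all hypothetical; in the paper the thresholds are estimated with Particle Swarm Optimization.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_filter(candidates, seeds, embed, threshold):
    """Keep candidates whose best similarity to any seed word meets the threshold.

    Hypothetical helper: `embed` maps a term to its vector. In a real
    pipeline these vectors would come from a BERT model; here they are
    supplied by the caller.
    """
    kept = []
    for term in candidates:
        best = max(cosine(embed[term], embed[seed]) for seed in seeds)
        if best >= threshold:
            kept.append(term)
    return kept


# Toy vectors made up for illustration; they are NOT real BERT embeddings.
TOY_EMBEDDINGS = {
    "food": [0.9, 0.1, 0.0],
    "pizza": [0.8, 0.2, 0.1],
    "waiter": [0.1, 0.9, 0.1],
    "service": [0.0, 1.0, 0.0],
}

# Filter candidate terms for the "food" category with an arbitrary threshold.
food_terms = semantic_filter(
    ["pizza", "waiter"], ["food"], TOY_EMBEDDINGS, threshold=0.7
)
print(food_terms)  # ['pizza']
```

With these toy vectors, "pizza" is close to the seed "food" and is kept, while "waiter" falls well below the threshold and is pruned from the food category. This is the kind of distinction a semantic filter can make when two terms co-occur in the same reviews.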
