Journal
KNOWLEDGE-BASED SYSTEMS
Volume 246, Issue -, Pages -
Publisher
ELSEVIER
DOI: 10.1016/j.knosys.2022.108668
Keywords
Sentiment analysis; Aspect term extraction; Guided LDA; BERT; Semantic similarity
Aspect-level sentiment analysis is a fine-grained task that extracts aspects and their sentiment polarity from opinionated text. This research proposes an unsupervised model that uses a minimal set of aspect seed words to guide the extraction process. The model combines guided inputs, multiple pruning strategies, and semantic filters. Evaluation on restaurant-domain datasets shows competitive performance.
Aspect-level sentiment analysis is a fine-grained task in sentiment analysis. It extracts aspects and their corresponding sentiment polarity from opinionated text. The first subtask, identifying the opinionated aspects, is called aspect extraction and is the focus of this work. Social media platforms are an enormous source of unlabeled data, but data annotation for fine-grained tasks is expensive and laborious, so unsupervised models are desirable. The proposed model is an unsupervised approach to aspect term extraction: a guided Latent Dirichlet Allocation (LDA) model that uses minimal aspect seed words from each aspect category to steer the model toward the hidden topics of interest to the user. The guided LDA model is enhanced by guiding its inputs with regular expressions based on linguistic rules. It is further enhanced by multiple pruning strategies, including a BERT-based semantic filter, which brings in semantic information for cases where co-occurrence statistics alone might fail to serve as a differentiator. The thresholds for these semantic filters are estimated using a Particle Swarm Optimization strategy. The proposed model is expected to overcome a disadvantage of basic LDA models, which fail to differentiate the overlapping topics that represent each aspect category. The work has been evaluated on the restaurant domain of the SemEval 2014, 2015 and 2016 datasets, with reported F-measures of 0.81, 0.74 and 0.75 respectively, which is competitive with state-of-the-art unsupervised baselines and respectable even relative to supervised baselines. © 2022 Elsevier B.V. All rights reserved.
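The two ideas at the core of the abstract, seeding topics with aspect words and pruning candidates by semantic similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `base` and `boost` weights, the toy two-dimensional vectors (stand-ins for BERT embeddings), and the hand-picked threshold of 0.5 (which the paper instead estimates with Particle Swarm Optimization) are all assumptions made for illustration.

```python
import math

def build_seed_prior(vocab, seed_words, base=0.01, boost=1.0):
    """Build one row of topic-word prior weights per aspect category.

    Every word gets a small base weight; a category's own seed words get
    an extra boost, nudging that topic toward the seeded aspect.
    """
    index = {word: i for i, word in enumerate(vocab)}
    prior = []
    for seeds in seed_words.values():
        row = [base] * len(vocab)
        for word in seeds:
            if word in index:
                row[index[word]] += boost
        prior.append(row)
    return prior

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def semantic_filter(candidates, seeds, embed, threshold):
    """Keep candidates whose best similarity to any seed meets the threshold."""
    kept = []
    for term in candidates:
        best = max(cosine(embed[term], embed[seed]) for seed in seeds)
        if best >= threshold:
            kept.append(term)
    return kept

# Hypothetical toy data for a restaurant domain.
vocab = ["pizza", "waiter", "price", "table"]
seed_words = {"food": ["pizza"], "service": ["waiter"]}
prior = build_seed_prior(vocab, seed_words)

# Toy vectors standing in for BERT embeddings (assumption).
embed = {
    "pizza": [1.0, 0.0],
    "pasta": [0.9, 0.1],
    "waiter": [0.0, 1.0],
    "parking": [-1.0, 0.2],
}
kept = semantic_filter(["pasta", "waiter", "parking"], ["pizza"], embed, 0.5)
```

In this toy run, only "pasta" is close enough to the "food" seed to survive the filter. A prior matrix of this shape could be passed to an LDA implementation that accepts a per-topic word prior, and the threshold could be tuned on held-out data rather than fixed by hand.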