4.7 Article

An enhanced guided LDA model augmented with BERT based semantic strength for aspect term extraction in sentiment analysis

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 246, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2022.108668

Keywords

Sentiment analysis; Aspect term extraction; Guided LDA; BERT; Semantic similarity

Aspect-level sentiment analysis is a fine-grained task that extracts aspects and their sentiment polarity from opinionated text. This research proposes an unsupervised model that uses a minimal set of aspect seed words to guide the extraction process. The model combines guided inputs, multiple pruning strategies, and semantic filters to improve extraction quality. Evaluation on restaurant-domain datasets shows results that are competitive with unsupervised baselines.
Aspect-level sentiment analysis is a fine-grained task in sentiment analysis. It extracts aspects and their corresponding sentiment polarity from opinionated text. The first subtask, identifying the opinionated aspects, is called aspect extraction and is the focus of this work. Social media platforms are an enormous source of unlabeled data, but annotating data for fine-grained tasks is expensive and laborious, which makes unsupervised models attractive. The proposed model is an unsupervised approach to aspect term extraction: a guided Latent Dirichlet Allocation (LDA) model that uses a minimal set of aspect seed words from each aspect category to steer the model toward the hidden topics of interest to the user. The guided LDA model is enhanced by guiding its inputs with regular expressions based on linguistic rules. It is further enhanced by multiple pruning strategies, including a BERT-based semantic filter, which adds semantic information for cases where co-occurrence statistics alone fail to differentiate terms. The thresholds for these semantic filters are estimated using Particle Swarm Optimization. The proposed model is expected to overcome a disadvantage of basic LDA models, which fail to differentiate the overlapping topics that represent each aspect category. The work was evaluated on the restaurant domain of the SemEval 2014, 2015 and 2016 datasets and reports F-measures of 0.81, 0.74 and 0.75 respectively, which is competitive with state-of-the-art unsupervised baselines and appreciable even with respect to supervised baselines. © 2022 Elsevier B.V. All rights reserved.
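
To make the idea of a threshold-based semantic filter concrete, the following is a minimal sketch and not the authors' implementation. It assumes each candidate term and each seed word already has an embedding vector; the tiny hand-written vectors in `TOY_EMBED` are hypothetical stand-ins for BERT embeddings, and the function names and the threshold value of 0.5 are illustrative assumptions (the paper estimates its thresholds with Particle Swarm Optimization, which is not shown here).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_filter(candidates, seeds, embed, threshold):
    """Keep candidates whose best similarity to any seed word meets the threshold.

    `embed` maps a term to its vector. In a real pipeline these vectors
    would come from a BERT encoder; here they are supplied by the caller.
    """
    kept = []
    for term in candidates:
        best = max(cosine(embed[term], embed[seed]) for seed in seeds)
        if best >= threshold:
            kept.append(term)
    return kept


# Hypothetical toy vectors, NOT real BERT embeddings.
TOY_EMBED = {
    "food": [1.0, 0.0, 0.0],       # seed word for a "food" aspect category
    "pizza": [0.9, 0.1, 0.0],      # close to "food"
    "yesterday": [0.0, 0.0, 1.0],  # unrelated to "food"
}

# Illustrative threshold; the paper tunes such thresholds with PSO.
result = semantic_filter(["pizza", "yesterday"], ["food"], TOY_EMBED, 0.5)
print(result)
```

In this toy setting, "pizza" survives the filter because its vector points almost in the same direction as the seed "food", while "yesterday" is pruned because it is orthogonal to the seed.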
