Article

Sentiment lexicon construction for Chinese book reviews based on ultrashort reviews

Journal

ELECTRONIC LIBRARY
Volume 40, Issue 3, Pages 221-236

Publisher

EMERALD GROUP PUBLISHING LTD
DOI: 10.1108/EL-07-2021-0147

Keywords

Sentiment lexicon; Sentiment analysis; Corpus-based; Chinese book reviews; China; Book reviews

Funding

  1. General Project of the National Social Science Fund: Research on Automatic Construction and Long-Term Evolution of Intangible Cultural Heritage Knowledge Graph from the Perspective of Digital Humanities [20BTQ071]

This paper presents a method of constructing a sentiment lexicon based on ultrashort reviews, successfully building one for Chinese books. The performance of the sentiment lexicon is evaluated through experiments, demonstrating its effectiveness.
Purpose: A sentiment lexicon is an essential resource for sentiment analysis of user reviews. So far, there is still no large-scale, high-accuracy domain sentiment lexicon for Chinese book reviews. This paper aims to construct a large-scale sentiment lexicon based on ultrashort reviews of Chinese books.

Design/methodology/approach: First, large-scale ultrashort reviews of Chinese books, each no longer than six Chinese characters, are collected and preprocessed as candidate sentiment words. Second, non-sentiment words are filtered out using rules based on part of speech, context, feature words and user behaviour. Third, relative frequency is used to select sentiment words and judge their polarity. Finally, the performance of the sentiment lexicon is evaluated through experiments.

Findings: This paper proposes a method of sentiment lexicon construction based on ultrashort reviews and successfully builds a lexicon of nearly 40,000 words for Chinese books based on Douban book reviews.

Originality/value: Compared with approaches that construct a sentiment lexicon from a small number of reviews, the proposed method takes full advantage of data scale when building the corpus. Moreover, unlike methods that rely on automatic word segmentation, it avoids the problems caused by immature segmentation technology and an imperfect N-gram language model.
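
The abstract does not define how relative frequency is computed, so the following Python sketch is only one plausible reading of the first and third steps, not the authors' implementation. It assumes each review carries a hypothetical "pos"/"neg" label (for example, derived from a star rating), treats every review of at most six Chinese characters as a candidate word, and compares a candidate's relative frequency among positive reviews with its relative frequency among negative reviews. The names `build_lexicon`, `min_count` and `threshold` are illustrative assumptions, and the rule-based filtering of the second step is omitted.

```python
from collections import Counter

MAX_LEN = 6  # "ultrashort": at most six Chinese characters (from the abstract)


def is_cjk(ch):
    """True if ch lies in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def candidates(reviews):
    """Yield (text, label) for reviews made of 1..MAX_LEN Chinese characters."""
    for text, label in reviews:
        text = text.strip()
        if 0 < len(text) <= MAX_LEN and all(is_cjk(c) for c in text):
            yield text, label


def build_lexicon(reviews, min_count=2, threshold=0.7):
    """Assign a polarity to each candidate by relative frequency.

    Assumption: `label` is "pos" or "neg". `min_count` and `threshold`
    are hypothetical tuning parameters, not values from the paper.
    """
    pos, neg = Counter(), Counter()
    for text, label in candidates(reviews):
        (pos if label == "pos" else neg)[text] += 1

    total_pos = sum(pos.values()) or 1  # avoid division by zero
    total_neg = sum(neg.values()) or 1

    lexicon = {}
    for word in set(pos) | set(neg):
        if pos[word] + neg[word] < min_count:
            continue  # too rare to judge reliably
        rel_pos = pos[word] / total_pos  # relative frequency in positive reviews
        rel_neg = neg[word] / total_neg  # relative frequency in negative reviews
        share = rel_pos / (rel_pos + rel_neg)
        if share >= threshold:
            lexicon[word] = "positive"
        elif share <= 1 - threshold:
            lexicon[word] = "negative"
        # otherwise the candidate is ambiguous and left out
    return lexicon


# Toy data, invented purely for illustration.
reviews = [
    ("很好看", "pos"), ("很好看", "pos"), ("很好看", "neg"),
    ("太无聊", "neg"), ("太无聊", "neg"),
    ("一般", "pos"), ("一般", "neg"),
    ("这本书的情节非常精彩", "pos"),  # longer than six characters, skipped
    ("good", "pos"),                  # not Chinese characters, skipped
]
lexicon = build_lexicon(reviews)
```

Because each whole ultrashort review is treated as one candidate, the sketch needs no word segmenter, which matches the motivation given under Originality/value.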
