4.7 Review

An unsupervised topic-sentiment joint probabilistic model for detecting deceptive reviews

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 114, Pages 210-223

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2018.07.005

Keywords

Deceptive review detection; Topic-sentiment joint probabilistic model; Latent Dirichlet allocation; Gibbs sampling

Funding

  1. Natural Science Foundation of China [71772107, 71403151, 61502281, 61433012]
  2. Key R&D Plan of Shandong Province [2018GGX101045]
  3. Natural Science Foundation of Shandong Province [ZR2018BF013, ZR2013FM023, ZR2014FP011]
  4. Shandong Education Quality Improvement Plan for Postgraduate
  5. China's Post-doctoral Science Fund [2014M561948]
  6. Postdoctoral innovation project special funds of Shandong Province [201403007]
  7. Applied research project for Qingdao postdoctoral researcher
  8. Project of Shandong Province Higher Educational Science and Technology Program [J14LN33]
  9. Leading talent development program of Shandong University of Science and Technology
  10. Special funding for Taishan scholar construction project


In electronic commerce, online reviews play an important role in customers' purchasing decisions. Unfortunately, malicious sellers often hire buyers to fabricate fake reviews that improve their reputation. To detect deceptive reviews and mine their topics and sentiments, we propose an unsupervised topic-sentiment joint probabilistic model (UTSJ) based on the Latent Dirichlet Allocation (LDA) model. The model first employs the Gibbs sampling algorithm to approximate the parameters of the maximum likelihood function offline and obtain a topic-sentiment joint probability distribution vector for each review. A Random Forest classifier and an SVM (Support Vector Machine) classifier are then each trained offline. Experimental results on real-life datasets show that the proposed model outperforms baseline models such as n-grams, character n-grams in token, POS (part-of-speech), LDA, and JST (Joint Sentiment/Topic). Moreover, the UTSJ model outperforms or performs similarly to benchmark models in detecting deceptive reviews on both balanced and unbalanced datasets in different domains. In particular, the UTSJ model handles real-life unbalanced big data well, which makes it well suited to e-commerce environments. © 2018 Elsevier Ltd. All rights reserved.
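The first stage described in the abstract, using Gibbs sampling to obtain one probability vector per review, can be illustrated with a minimal sketch. Note that this is collapsed Gibbs sampling for plain LDA (topics only), not the UTSJ model itself, whose joint topic-sentiment update rules are not given in the abstract. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy reviews are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for plain LDA (NOT the paper's UTSJ model).

    docs: list of token lists. Returns one topic-probability vector per doc.
    alpha, beta, n_iters are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables: doc-topic, topic-word, and per-topic totals.
    n_dk = [[0] * n_topics for _ in docs]
    n_kw = [[0] * V for _ in range(n_topics)]
    n_k = [0] * n_topics

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Unnormalized full conditional p(z = t | all other tokens).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed doc-topic estimate; each row sums to 1.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        theta.append([(n_dk[d][t] + alpha) / denom for t in range(n_topics)])
    return theta


# Toy reviews, invented for illustration only.
reviews = [
    "great hotel great staff clean room".split(),
    "clean room friendly staff great location".split(),
    "terrible phone battery died fast".split(),
    "phone screen cracked battery terrible".split(),
]
features = lda_gibbs(reviews, n_topics=2)
for vec in features:
    print([round(p, 3) for p in vec])
```

Each row of `features` is a fixed-length vector that could serve as input to the second stage, for example a Random Forest or SVM classifier from a library such as scikit-learn. In the paper's setting the vector would carry joint topic-sentiment probabilities rather than topic probabilities alone.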

