3.8 Proceedings Paper

Opinion Spam Detection in Web Forum: A Real Case Study

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/2736277.2741085

Keywords

Fake Web Review; Opinion Spam Detection; Web Forum

Funding

  1. National Taiwan University [NTU-ERP-103R890858, NTU-ERP-104R890858]
  2. Ministry of Science and Technology, Taiwan [MOST 102-2221-E-002-103-MY3]

Ask authors/readers for more resources

Opinion spamming refers to the illegal marketing practice which involves delivering commercially advantageous opinions as regular users. In this paper, we conduct a real case study based on a set of internal records of opinion spams leaked from a shady marketing campaign. We explore the characteristics of opinion spams and spammers in a web forum to obtain some insights, including subtlety property of opinion spams, spam post ratio, spammer accounts, first post and replies, submission time of posts, activeness of threads, and collusion among spammers. Then we present features that could be potentially helpful in detecting spam opinions in threads. The results of spam detection on first posts show: (1) spam first posts put more focus on certain topics such as the user experiences' on the promoted items, (2) spam first posts generally use more words and pictures to showcase the promoted items in an attempt to impress people, (3) spam first posts tend to be submitted during work time, and (4) the threads that spam first posts initiate are more active to be placed at striking positions. The spam detection on replies is more challenging. Besides lower spam ratio and less content, replies even do not mention the promoted items. Their major intention is to keep the discussion in a thread alive to attract more attention on it. Submission time of replies, thread activeness, position of replies, and spamicity of first post are more useful than content-based features in spam detection on replies.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

3.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available