4.6 Review

Fake Reviews Detection: A Survey

Journal

IEEE ACCESS
Volume 9, Issue -, Pages 65771-65802

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3075573

Keywords

Feature extraction; Task analysis; Social networking (online); Deep learning; Companies; Licenses; Portable computers; Fake review; fake review detection; feature engineering; machine learning; deep learning

Ask authors/readers for more resources

User reviews are crucial in e-commerce but fake reviews have become a significant issue. Investigating existing datasets, feature extraction techniques, and identification methods can help businesses detect fake reviews and improve profitability.
In e-commerce, user reviews can play a significant role in determining the revenue of an organisation. Online users rely on reviews before making decisions about any product and service. As such, the credibility of online reviews is crucial for businesses and can directly affect companies' reputation and profitability. That is why some businesses are paying spammers to post fake reviews. These fake reviews exploit consumer purchasing decisions. Consequently, the techniques for detecting fake reviews have extensively been explored in the past twelve years. However, there still lacks a survey that can analyse and summarise the existing approaches. To bridge up the issue, this survey paper details the task of fake review detection, summing up the existing datasets and their collection methods. It analyses the existing feature extraction techniques. It also summarises and analyses the existing techniques critically to identify gaps based on two groups: traditional statistical machine learning and deep learning methods. Further, we conduct a benchmark study to investigate the performance of different neural network models and transformers that have not been used for fake review detection yet. The experimental results on two benchmark datasets show that RoBERTa performs about 7% better than the state-of-the-art methods in a mixed domain for the deception dataset with the highest accuracy of 91.2%, which can be used as a baseline for future studies. Finally, we highlight the current gaps in this research area and the possible future directions.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available