Using Machine Learning to Predict the Sentiment of Online Reviews: A New Framework for Comparative Analysis

ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING (2021)

Journal

ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING

Volume 28, Issue 4, Pages 2543-2566

Publisher

SPRINGER

DOI: 10.1007/s11831-020-09464-8

Categories

Computer Science, Interdisciplinary Applications; Engineering, Multidisciplinary; Mathematics, Interdisciplinary Applications

Funding

Indonesian Endowment Fund for Education (LPDP), Ministry of Finance
Directorate General of Higher Education (DIKTI), Ministry of Education and Culture, The Republic of Indonesia

Summary

This paper introduces a novel framework for predicting the ratings of online reviews with machine learning. It evaluates a range of classifiers and feature extraction methods and finds that the linear-kernel support vector machine, logistic regression, and multilayer perceptron are the three best single classifiers. Using these as base classifiers in ensemble models improves their performance further. The paper also explores text pre-processing techniques and sentiment lexicons as ways to improve accuracy, and reports promising results from deep learning models.

Abstract

Online reviews are becoming increasingly important for decision-making. Consumers often consult online reviews before making a purchase, and marketers use them to improve product success. However, the sheer volume of online review data and its unstructured nature make it hard to draw conclusions quickly. In this paper, we propose a novel framework for gauging the ratings of online reviews using machine learning techniques. The framework combines text pre-processing and feature extraction methods. We investigate four aspects of the new framework. First, we assess the performance of single and ensemble classifiers in predicting sentiment (positive or negative), initially on a specific dataset (Yelp) and subsequently on two other datasets (Amazon product reviews and a movie review dataset). Second, using the best classifiers identified, we improve the accuracy with which neutral polarity can be predicted, an ability largely overlooked in the literature. Third, we further improve the performance of these classifiers by testing different pre-processing and feature extraction methods. Finally, we measure how well our deep learning approach performs on the same task compared with the best previously identified classifiers. Our extensive testing shows that the linear-kernel support vector machine, logistic regression, and multilayer perceptron are the three best single classifiers in terms of accuracy, precision, recall, and F-measure. Their performance can be improved further by using them as base classifiers for ensemble models. We also observe that several text pre-processing techniques (negation word identification, word elongation correction, and part-of-speech lemmatisation, combined with term frequency and word N-gram features) can increase accuracy. In addition, we demonstrate that general sentiment lexicons such as SentiWordNet 3.0 and SenticNet 4 can be used to generate features with good results, although deep learning models can perform equally well. Experiments with different datasets confirm that our framework provides consistent outcomes. In particular, we have focused on improving the accuracy of neutral sentiment prediction, and we conclude by showing how this can be achieved without sacrificing the accuracy of positive or negative ratings.
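
As a rough illustration of two of the pre-processing steps named in the abstract (word elongation correction and negation word identification) followed by term-frequency counts over word N-grams, here is a minimal Python sketch. It is not the authors' implementation: the negation list, the `NOT_` prefix convention, the rule of collapsing repeated letters to two, and all function names are assumptions made for this example. Part-of-speech lemmatisation is omitted because it would need an external tagger and lemmatiser.

```python
import re
from collections import Counter

# Assumed, deliberately small negation list; the paper's own list is not given here.
NEGATIONS = {"not", "no", "never", "cannot"}
PUNCT = {".", ",", "!", "?", ";", ":"}


def tokenize(text):
    """Lowercase the text and split it into words and punctuation marks."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?|[.,!?;:]", text.lower())


def correct_elongation(token):
    """Collapse any character repeated three or more times down to two.

    This is a heuristic: 'goooood' becomes 'good', but 'soooo' becomes 'soo',
    so a dictionary lookup would still be needed for a full correction.
    """
    return re.sub(r"(.)\1{2,}", r"\1\1", token)


def mark_negation(tokens):
    """Prefix tokens that follow a negation word with 'NOT_' until punctuation."""
    out, negating = [], False
    for tok in tokens:
        if tok in PUNCT:
            negating = False  # the negation scope ends at punctuation
            out.append(tok)
        elif tok in NEGATIONS or tok.endswith("n't"):
            negating = True
            out.append(tok)
        else:
            out.append("NOT_" + tok if negating else tok)
    return out


def preprocess(text):
    """Tokenize, correct elongated words, then mark negated tokens."""
    tokens = [correct_elongation(t) for t in tokenize(text)]
    return mark_negation(tokens)


def ngram_term_frequency(tokens, n_max=2):
    """Count word N-grams of length 1..n_max, ignoring punctuation tokens."""
    words = [t for t in tokens if t not in PUNCT]
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    review = "The food was not goooood, but the staff were nice."
    tokens = preprocess(review)
    print(tokens)
    print(ngram_term_frequency(tokens).most_common(5))
```

The resulting counts could serve as input features for classifiers such as the linear-kernel support vector machine or logistic regression mentioned above; which library to use for that step is left open here.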
