Article

An automated learning model for sentiment analysis and data classification of Twitter data using balanced CA-SVM

Journal

CONCURRENT ENGINEERING-RESEARCH AND APPLICATIONS
Volume 29, Issue 4, Pages 386-395

Publisher

SAGE PUBLICATIONS LTD
DOI: 10.1177/1063293X211031485

Keywords

sentiment analysis; big data; Hadoop Distributed File System; SVM; k-means; ANN; TSS; machine learning

People now spend much of their time on social media, and sentiment analysis of a Twitter data set helps identify user emotions and solve related problems efficiently. The proposed CA-SVM model preprocesses Twitter data, clusters tweets using TGS-K means clustering, and classifies them with a support vector machine, achieving high accuracy and a high sentiment score.
Modern society spends much of each day on social media. Web users share many details with their friends there, and the information obtained from these exchanges has been used in several applications. Sentiment analysis is one such application: applied to a Twitter data set, it identifies the emotion of a user, and different problems can be solved on that basis. The proposed automated learning model, based on CA-SVM, reads the Twitter data set. First, the data from the Twitter database is preprocessed through tokenization, stemming, stop word removal, and number removal. Features are then extracted, which yields a set of terms. Using these terms, the tweets are clustered with TGS-K means clustering, which measures Euclidean distance over features such as semantic sentiment score (SSS), gazetteer and symbolic sentiment support (GSSS), and topical sentiment score (TSS). The method then classifies each tweet with a support vector machine (CA-SVM) according to a support value computed from the measures above. The results are validated using k-fold cross-validation. Classification is then performed with the Balanced CA-SVM (Deep Learning Modified Neural Network). The results are evaluated and compared with existing works: the proposed model achieved 92.48% accuracy and a 92.05% sentiment score.
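
The preprocessing and distance steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the suffix-stripping `crude_stem` helper, and the three-element feature vectors standing in for (SSS, GSSS, TSS) are all hypothetical, and the abstract does not give the formulas for those scores.

```python
import math
import re

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "was", "to", "of", "in", "on", "i"}


def crude_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Tokenize, drop number-bearing tokens, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", tweet.lower())                    # tokenization
    tokens = [t for t in tokens if not any(c.isdigit() for c in t)]     # number removal
    tokens = [t for t in tokens if t not in STOP_WORDS]                 # stop word removal
    return [crude_stem(t) for t in tokens]                              # stemming


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` by Euclidean distance.

    `point` is assumed to be a placeholder (SSS, GSSS, TSS) score vector.
    """
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


if __name__ == "__main__":
    print(preprocess("Watching 2 great films"))
    print(nearest_centroid((0.9, 0.1, 0.2), [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

The assignment step shown here is only the inner loop of k-means; a full run would alternate it with recomputing centroids, and the classification and k-fold validation stages would typically rely on an established machine learning library.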
