☆ 4.6 Article

A Novel Approach for Emotion Detection and Sentiment Analysis for Low Resource Urdu Language Based on CNN-LSTM

ELECTRONICS (2022)

期刊

ELECTRONICS

卷 11, 期 24, 页码 -

出版社

MDPI

DOI: 10.3390/electronics11244096

关键词

emotion detection; sentiment analysis; Roman Urdu; machine learning; deep learning

类别

Computer Science, Information Systems Engineering, Electrical & Electronic Physics, Applied

资金

Vice presidency for Graduate Studies, Business and Scientific Research (GBR) at Dar Al Hekma University, Jeddah Saudi Arabia

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

Emotion detection and sentiment analysis are important in identifying individuals' interest levels. While English and Chinese have received much attention in the field, the research on poor-resource languages like Urdu has been lacking. This study focuses on Roman Urdu and proposes a CNN-LSTM method with Word2Vec for emotion detection and sentiment analysis, which outperforms other approaches. The accuracy of emotion detection increased from 85% to 95%, and sentiment analysis improved from 89% to 93.3%.

Emotion detection (ED) and sentiment analysis (SA) play a vital role in identifying an individual's level of interest in any given field. Humans use facial expressions, voice pitch, gestures, and words to convey their emotions. Emotion detection and sentiment analysis in English and Chinese have received much attention in the last decade. Still, poor-resource languages such as Urdu have been mostly disregarded, which is the primary focus of this research. Roman Urdu should also be investigated like other languages because social media platforms are frequently used for communication. Roman Urdu faces a significant challenge in the absence of corpus for emotion detection and sentiment analysis because linguistic resources are vital for natural language processing. In this study, we create a corpus of 1021 sentences for emotion detection and 20,251 sentences for sentiment analysis, both obtained from various areas, and annotate it with the aid of human annotators from six and three classes, respectively. In order to train large-scale unlabeled data, the bag-of-word, term frequency-inverse document frequency, and Skip-gram models are employed, and the learned word vector is then fed into the CNN-LSTM model. In addition to our proposed approach, we also use other fundamental algorithms, including a convolutional neural network, long short-term memory, artificial neural networks, and recurrent neural networks for comparison. The result indicates that the CNN-LSTM proposed method paired with Word2Vec is more effective than other approaches regarding emotion detection and evaluating sentiment analysis in Roman Urdu. Furthermore, we compare our based model with some previous work. Both emotion detection and sentiment analysis have seen significant improvements, jumping from an accuracy of 85% to 95% and from 89% to 93.3%, respectively.

A Novel Approach for Emotion Detection and Sentiment Analysis for Low Resource Urdu Language Based on CNN-LSTM

期刊

ELECTRONICS

出版社

MDPI

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

A Novel Approach for Emotion Detection and Sentiment Analysis for Low Resource Urdu Language Based on CNN-LSTM

期刊

ELECTRONICS

出版社

MDPI

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文