Article

Emoji-powered Sentiment and Emotion Detection from Software Developers' Communication Data

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2021)

Journal

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY

Volume 30, Issue 2, Pages -

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3424308

Keywords

Emoji; sentiment; emotion; software engineering

Categories

Computer Science, Software Engineering

Funding

National Key R&D Program of China [2018YFB1004800]
Beijing Outstanding Young Scientist Program [BJJWZYJH01201910001004]
National Natural Science Foundation of China [J1924032, 61725201]
Key Laboratory of Intelligent Passenger Service of Civil Aviation

Summary

Using emojis as self-reported labels for sentiment and emotion detection helps address the scarcity of labeled data and yields significant improvements on software engineering benchmark datasets.

Abstract

Sentiment and emotion detection from developers' textual communication records has various application scenarios in software engineering (SE). However, commonly used off-the-shelf sentiment/emotion detection tools cannot obtain reliable results in SE tasks, and misunderstanding of technical knowledge has been demonstrated to be the main reason. Researchers have therefore started to create labeled SE-related datasets manually and to customize SE-specific methods. However, the scarce labeled data can cover only a very limited lexicon and range of expressions. In this article, we employ emojis as an instrument to address this problem. Unlike manual labels provided by annotators, emojis are self-reported labels that authors themselves supply to intentionally convey affective states, and thus they are suitable indicators of sentiment and emotion in texts. Since emojis have been widely adopted in online communication, a large amount of emoji-labeled text can be easily accessed to help tackle the scarcity of manually labeled data. Specifically, we leverage Tweets and GitHub posts containing emojis to learn representations of SE-related texts through emoji prediction. By predicting the emojis contained in each text, texts that tend to surround the same emoji are represented with similar vectors, which transfers the sentiment knowledge contained in emoji usage to the representations of texts. We then leverage the sentiment-aware representations as well as manually labeled data to learn the final sentiment/emotion classifier via transfer learning. Compared to existing approaches, our approach achieves significant improvement on representative benchmark datasets, with an average increase of 0.036 and 0.049 in macro-F1 in sentiment and emotion detection, respectively. Further investigations reveal that the large-scale Tweets make a key contribution to the power of our approach. This finding informs future research not to unilaterally pursue domain-specific resources but to try to transfer knowledge from the open domain through ubiquitous signals such as emojis. Finally, we present the open challenges of sentiment and emotion detection in SE through a qualitative analysis of texts misclassified by our approach.
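
The idea of treating emojis as self-reported labels can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `make_emoji_samples`, the narrow code-point ranges in `EMOJI_PATTERN`, and the choice to emit one sample per distinct emoji are all assumptions made here for illustration. The sketch only shows how emoji-bearing posts could be turned into (text, emoji) pairs for an emoji-prediction task.

```python
import re

# Assumption: a small, illustrative set of emoji code-point ranges.
# Real emoji coverage is broader, and the paper's preprocessing may differ.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def make_emoji_samples(posts):
    """Turn raw posts into (text_without_emoji, emoji_label) pairs.

    Hypothetical helper: posts without emojis are skipped, and a post with
    several distinct emojis yields one sample per distinct emoji.
    """
    samples = []
    for post in posts:
        emojis = EMOJI_PATTERN.findall(post)
        if not emojis:
            continue  # no self-reported label available for this post
        # Remove the emojis from the input text and normalize whitespace,
        # so that a model has to predict the label from the words alone.
        text = " ".join(EMOJI_PATTERN.sub(" ", post).split())
        if not text:
            continue  # the post consisted only of emojis
        for emoji in sorted(set(emojis)):
            samples.append((text, emoji))
    return samples


if __name__ == "__main__":
    posts = [
        "Great fix, thanks 😀",
        "This build is broken again 😡😡",
        "No emoji here",
    ]
    for text, label in make_emoji_samples(posts):
        print(label, "<-", text)
```

Pairs like these could serve as training data for an emoji-prediction model whose learned text representations are later reused, together with the manually labeled SE data, to train the final sentiment/emotion classifier.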
