Article

Detecting White Supremacist Hate Speech Using Domain Specific Word Embedding With Deep Learning and BERT

Journal

IEEE ACCESS
Volume 9, Issue -, Pages 106363-106374

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3100435

Keywords

Social networking (online); Deep learning; Blogs; Bit error rate; Support vector machines; Semantics; Media; BERT; deep learning; NLP; white supremacist; hate speech; Twitter

Abstract

White supremacist hate speech has become a harmful form of content on social media, with detrimental impacts on society, highlighting the need for automatic detection of such speech. This research explores the feasibility of detecting white supremacist hate speech on Twitter using deep learning and natural language processing techniques, employing BiLSTM and BERT models for this purpose. The BiLSTM model achieved a 0.75 F1-score, while BERT reached a higher 0.80 F1-score; both were evaluated on a balanced dataset combining Twitter data with a Stormfront dataset.
White supremacist hate speech is among the most recently observed forms of harmful content on social media. The critical influence of these radical groups is no longer limited to social media and can negatively affect society by promoting racial hatred and violence. Traditional channels for reporting hate speech have proved inadequate due to the tremendous explosion of information and the implicit nature of hate speech. Therefore, it is necessary to detect such speech automatically and in a timely manner. This research investigates the feasibility of automatically detecting white supremacist hate speech on Twitter using deep learning and natural language processing techniques. Two deep learning models are investigated. The first approach utilizes a bidirectional Long Short-Term Memory (BiLSTM) model along with domain-specific word embeddings extracted from a white supremacist corpus to capture the semantics of white supremacist slang and coded words. The second approach utilizes one of the most recent language models, Bidirectional Encoder Representations from Transformers (BERT). The BiLSTM model achieved a 0.75 F1-score and BERT reached a 0.80 F1-score. Both models are tested on a balanced dataset combined from Twitter and a Stormfront dataset compiled from a white supremacist forum.
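The abstract outlines two modelling approaches. As a rough illustration only (this is not the authors' released code), the sketch below shows how such a pipeline is commonly assembled: word2vec embeddings trained on an in-domain corpus feeding a frozen embedding layer ahead of a BiLSTM classifier, and, separately, fine-tuning BERT-base for two-class classification. All identifiers, dimensions, and hyperparameters here are assumptions, not values reported in the paper.

```python
# Minimal sketch of the two approaches described in the abstract.
# Assumptions (not from the paper): 300-d embeddings, LSTM width 128,
# binary "white supremacist / other" labels, bert-base-uncased as the BERT variant.

import numpy as np
import tensorflow as tf
from gensim.models import Word2Vec
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

EMBED_DIM = 300  # assumed embedding size

# --- (1) Domain-specific embeddings + BiLSTM --------------------------------
def train_domain_embeddings(tokenised_corpus):
    """Train word2vec on an in-domain corpus (e.g. Stormfront posts and tweets)."""
    return Word2Vec(sentences=tokenised_corpus, vector_size=EMBED_DIM,
                    window=5, min_count=2, workers=4)

def build_bilstm(word_index, w2v):
    """BiLSTM classifier whose embedding layer is initialised from the domain word2vec."""
    matrix = np.zeros((len(word_index) + 1, EMBED_DIM), dtype="float32")
    for word, idx in word_index.items():
        if word in w2v.wv:
            matrix[idx] = w2v.wv[word]  # carries domain slang / coded-word semantics
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(
            input_dim=matrix.shape[0], output_dim=EMBED_DIM,
            embeddings_initializer=tf.keras.initializers.Constant(matrix),
            trainable=False, mask_zero=True),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128)),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # white supremacist vs. other
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# --- (2) Fine-tuning BERT ----------------------------------------------------
def build_bert():
    """Standard BERT-base fine-tuning setup for two-class hate-speech detection."""
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = TFAutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    model.compile(optimizer=tf.keras.optimizers.Adam(2e-5),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=["accuracy"])
    return tokenizer, model
```

The in-domain embedding matrix is what lets the BiLSTM pick up slang and coded words that general-purpose embeddings tend to miss, which is the motivation the abstract gives for training embeddings on a white supremacist corpus rather than reusing off-the-shelf vectors.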
