Article

Detecting White Supremacist Hate Speech Using Domain Specific Word Embedding With Deep Learning and BERT

Journal

IEEE ACCESS
Volume 9, Issue -, Pages 106363-106374

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3100435

Keywords

Social networking (online); Deep learning; Blogs; Bit error rate; Support vector machines; Semantics; Media; BERT; deep learning; NLP; white supremacist; hate speech; Twitter

Abstract

White supremacist hate speech has become a harmful form of content on social media, with detrimental impacts on society, highlighting the need for automatic detection of such speech. This research explores the feasibility of detecting white supremacist hate speech on Twitter using deep learning and natural language processing techniques, employing BiLSTM and BERT models for this purpose. The BiLSTM model achieved a 0.75 F1-score, while BERT reached a higher 0.80 F1-score; both were evaluated on a balanced dataset combining Twitter data with a Stormfront dataset.
White supremacist hate speech is among the most recently observed forms of harmful content on social media. The critical influence of these radical groups is no longer limited to social media and can negatively affect society by promoting racial hatred and violence. Traditional channels for reporting hate speech have proved inadequate due to the tremendous explosion of information and the implicit nature of hate speech. Therefore, it is necessary to detect such speech automatically and in a timely manner. This research investigates the feasibility of automatically detecting white supremacist hate speech on Twitter using deep learning and natural language processing techniques. Two deep learning models are investigated. The first approach utilizes a bidirectional Long Short-Term Memory (BiLSTM) model along with domain-specific word embeddings extracted from a white supremacist corpus to capture the semantics of white supremacist slang and coded words. The second approach utilizes one of the most recent language models, Bidirectional Encoder Representations from Transformers (BERT). The BiLSTM model achieved a 0.75 F1-score and BERT reached a 0.80 F1-score. Both models are tested on a balanced dataset combined from Twitter and a Stormfront dataset compiled from a white supremacist forum.
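The abstract outlines two modelling approaches. As a rough illustration only (this is not the authors' released code), the sketch below shows how such a pipeline is commonly assembled: word2vec embeddings trained on an in-domain corpus feeding a frozen embedding layer ahead of a BiLSTM classifier, and, separately, fine-tuning BERT-base for two-class classification. All identifiers, dimensions, and hyperparameters here are assumptions, not values reported in the paper.

```python
# Minimal sketch of the two approaches described in the abstract.
# Assumptions (not from the paper): 300-d embeddings, LSTM width 128,
# binary "white supremacist / other" labels, bert-base-uncased as the BERT variant.

import numpy as np
import tensorflow as tf
from gensim.models import Word2Vec
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

EMBED_DIM = 300  # assumed embedding size

# --- (1) Domain-specific embeddings + BiLSTM --------------------------------
def train_domain_embeddings(tokenised_corpus):
    """Train word2vec on an in-domain corpus (e.g. Stormfront posts and tweets)."""
    return Word2Vec(sentences=tokenised_corpus, vector_size=EMBED_DIM,
                    window=5, min_count=2, workers=4)

def build_bilstm(word_index, w2v):
    """BiLSTM classifier whose embedding layer is initialised from the domain word2vec."""
    matrix = np.zeros((len(word_index) + 1, EMBED_DIM), dtype="float32")
    for word, idx in word_index.items():
        if word in w2v.wv:
            matrix[idx] = w2v.wv[word]  # carries domain slang / coded-word semantics
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(
            input_dim=matrix.shape[0], output_dim=EMBED_DIM,
            embeddings_initializer=tf.keras.initializers.Constant(matrix),
            trainable=False, mask_zero=True),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128)),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # white supremacist vs. other
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# --- (2) Fine-tuning BERT ----------------------------------------------------
def build_bert():
    """Standard BERT-base fine-tuning setup for two-class hate-speech detection."""
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = TFAutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    model.compile(optimizer=tf.keras.optimizers.Adam(2e-5),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=["accuracy"])
    return tokenizer, model
```

The in-domain embedding matrix is what lets the BiLSTM pick up slang and coded words that general-purpose embeddings tend to miss, which is the motivation the abstract gives for training embeddings on a white supremacist corpus rather than reusing off-the-shelf vectors.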
