Exploiting Textual Information for Fake News Detection

INTERNATIONAL JOURNAL OF NEURAL SYSTEMS (2022)

Journal

INTERNATIONAL JOURNAL OF NEURAL SYSTEMS

Volume 32, Issue 12, Pages -

Publisher

WORLD SCIENTIFIC PUBL CO PTE LTD

DOI: 10.1142/S0129065722500587

Keywords

Fake news; Machine Learning (ML); Artificial Neural Networks (ANN); Natural Language Processing (NLP); Association Rules Mining (ARM)

Category

Computer Science, Artificial Intelligence

Summary

This paper assesses the accuracy of machine learning algorithms in detecting fake news and proposes a method for enhancing the linguistic feature set using named entity recognition and an association rule mining algorithm. Different training/test feature sets are produced by mixing document embeddings with linguistic features. The results show that a convolutional neural network performs best, but a support vector machine achieves similar accuracy across a wider variety of input feature sets.

Abstract

Fake news refers to the deliberate dissemination of news with the intent to deceive and mislead the public. This paper assesses the accuracy of several Machine Learning (ML) algorithms, using a style-based technique that relies on textual information extracted from news, such as part-of-speech counts. To extend previously proposed style-based techniques, a new method of enhancing a linguistic feature set is proposed. It combines Named Entity Recognition (NER) with the Frequent Pattern (FP) Growth association rule mining algorithm, aiming to provide better insight into the sentence-level structure of the news articles. Recursive feature elimination was used to identify a subset of the highest-performing linguistic characteristics, which turned out to align with the literature. Document embeddings and weighted document embeddings were constructed from pre-trained word embeddings, with each word's TF-IDF value serving as the weight factor in the weighted variant. The document embeddings were mixed with the linguistic features, providing a variety of training/test feature sets. For each model, the best-performing feature set was identified and the model's hyperparameters were fine-tuned to improve accuracy. The ML algorithms' results were compared with two neural networks: a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. The results indicate that the CNN outperformed all other methods in terms of accuracy when combined with pre-trained word embeddings, yet the Support Vector Machine (SVM) performs almost as well across a wider variety of input feature sets. Although the style-based technique achieves lower accuracy, it provides explainable results about the author's writing style decisions. Our work points out how new technologies and combinations of existing techniques can enhance the style-based approach to capture more information.
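The TF-IDF-weighted document embedding mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy word vectors, the smoothed IDF formula, the helper names `tfidf_weights` and `weighted_doc_embedding`, and the two placeholder linguistic counts are all assumptions made for illustration. A real pipeline would load actual pre-trained embeddings and compute real part-of-speech counts.

```python
import math
from collections import Counter

# Hypothetical 2-dimensional "pre-trained" word vectors (toy values).
word_vectors = {
    "breaking": [1.0, 0.0],
    "news": [0.0, 1.0],
    "shocking": [1.0, 1.0],
}

# Toy corpus of already tokenized documents.
docs = [["breaking", "news"], ["shocking", "news"]]


def tfidf_weights(tokenized_docs):
    """Return one {token: tf-idf weight} dict per document.

    Assumption: relative term frequency times a smoothed IDF,
    ln((1 + N) / (1 + df)) + 1; the paper may use another variant.
    """
    n_docs = len(tokenized_docs)
    doc_freq = Counter(tok for doc in tokenized_docs for tok in set(doc))
    all_weights = []
    for doc in tokenized_docs:
        term_freq = Counter(doc)
        all_weights.append({
            tok: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, count in term_freq.items()
        })
    return all_weights


def weighted_doc_embedding(weights, vectors, dim):
    """Weighted average of word vectors, using TF-IDF values as weights."""
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, weight in weights.items():
        vec = vectors.get(tok)
        if vec is None:  # skip out-of-vocabulary tokens
            continue
        weight_sum += weight
        for i in range(dim):
            total[i] += weight * vec[i]
    if weight_sum == 0.0:
        return total
    return [value / weight_sum for value in total]


doc_embeddings = [
    weighted_doc_embedding(w, word_vectors, dim=2) for w in tfidf_weights(docs)
]

# Hypothetical linguistic features per document (e.g. two POS counts).
linguistic_features = [[1.0, 1.0], [1.0, 1.0]]

# "Mixing" is sketched here as plain concatenation of the two feature vectors.
mixed_features = [
    emb + ling for emb, ling in zip(doc_embeddings, linguistic_features)
]
```

In this sketch, words that occur in fewer documents pull the document vector more strongly toward their own embedding, which is the intended effect of using TF-IDF as the weight factor. The resulting `mixed_features` rows could then be passed to any classifier.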
