Article

Multi-Language Spam/Phishing Classification by Email Body Text: Toward Automated Security Incident Investigation

ELECTRONICS (2021)

Journal

ELECTRONICS

Volume 10, Issue 6

Publisher

MDPI

DOI: 10.3390/electronics10060668

Keywords

spam; phishing; classification; augmented dataset; multi-language emails

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Physics, Applied

Summary

This study presents an automated solution that classifies emails as spam or phishing based on the message body text. It also examines the limitations of using public datasets for a specific organization, to assess whether dataset updates are needed for more accurate classification results.

Abstract

Spam and phishing are two kinds of annoying, unwanted email that differ in their potential threat and impact to the user. Automated classification of these categories can increase users' awareness and can also be used to prioritize incident investigations or to gather facts automatically. However, there are currently no scientific papers that focus on classifying email into these two categories, spam and phishing. This paper therefore presents a solution that automatically classifies emails as spam or phishing based on the message body text. We apply the proposed solution to emails written in three languages: English, Russian, and Lithuanian. Because most public email datasets consist almost exclusively of English emails, we investigate whether automated dataset translation is suitable for adapting such a dataset to classify emails written in other languages. The paper also reports experiments on the limitations of using public datasets for a specific organization, to evaluate whether dataset updates are needed for more accurate classification results.
