Article

Improving spam email classification accuracy using ensemble techniques: a stacking approach

INTERNATIONAL JOURNAL OF INFORMATION SECURITY (2023)

Journal

INTERNATIONAL JOURNAL OF INFORMATION SECURITY

Volume -, Issue -, Pages -

Publisher

SPRINGER

DOI: 10.1007/s10207-023-00756-1

Keywords

Spam; Email; Classification; Machine learning; Ensemble; Stacking method

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods

Summary

This study focuses on enhancing spam email classification accuracy using stacking ensemble machine learning techniques. The stacking method achieved the highest accuracy, recall, and F1 score among the tested methods. The study presents an innovative combination of classifiers, contributing to the growing body of research on stacking techniques.

Abstract

Spam emails pose a substantial cybersecurity danger, necessitating accurate classification to reduce unwanted messages and mitigate risks. This study focuses on enhancing spam email classification accuracy using stacking ensemble machine learning techniques. We trained and tested five classifiers: logistic regression, decision tree, K-nearest neighbors (KNN), Gaussian naive Bayes and AdaBoost. To address overfitting, two distinct datasets of spam emails were aggregated and balanced. Evaluating individual classifiers based on recall, precision and F1 score metrics revealed AdaBoost as the top performer. Considering evolving spam technology and new message types challenging traditional approaches, we propose a stacking method. By combining predictions from multiple base models, the stacking method aims to improve classification accuracy. The results demonstrate superior performance of the stacking method with the highest accuracy (98.8%), recall (98.8%) and F1 score (98.9%) among tested methods. Additional experiments validated our approach by varying dataset sizes and testing different classifier combinations. Our study presents an innovative combination of classifiers that significantly improves accuracy, contributing to the growing body of research on stacking techniques. Moreover, we compare classifier performances using a unique combination of two datasets, highlighting the potential of ensemble techniques, specifically stacking, in enhancing spam email classification accuracy. The implications extend beyond spam classification systems, offering insights applicable to other classification tasks. Continued research on emerging spam techniques is vital to ensure long-term effectiveness.
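The stacking setup described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn, uses the five classifiers named above as base models, and reports the same metrics. Several details are assumptions made for illustration and are not stated in the abstract: the logistic regression meta-learner, the 5-fold internal cross-validation, all hyperparameters, and the synthetic balanced data from `make_classification`, which stands in for the two aggregated spam datasets. The scores it prints are therefore unrelated to the figures reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def build_stack():
    """Stack the five classifiers named in the abstract (settings are assumed)."""
    base_models = [
        ("logreg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("gnb", GaussianNB()),
        ("ada", AdaBoostClassifier(random_state=0)),
    ]
    # Assumption: a logistic regression meta-learner trained on
    # out-of-fold predictions of the base models (cv=5).
    return StackingClassifier(
        estimators=base_models,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model and return the metrics used in the abstract."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }


# Synthetic, balanced stand-in data (NOT the paper's spam datasets).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    weights=[0.5, 0.5], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

scores = evaluate(build_stack(), X_train, X_test, y_train, y_test)
print(scores)
```

To compare against a single classifier, as the study does, the same `evaluate` helper can be called with any one of the base models in place of `build_stack()`. For real email text, a feature extraction step (for example a bag-of-words vectorizer) would have to precede the classifiers; that step is omitted here.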
