4.7 Article

LSTM Based Phishing Detection for Big Email Data

Journal

IEEE TRANSACTIONS ON BIG DATA
Volume 8, Issue 1, Pages 278-288

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TBDATA.2020.2978915

Keywords

Phishing email; LSTM; social engineering

Funding

  1. National Key Research and Development Program [2019QY1400, 2018YFB0804503]
  2. National Key Project [GJXM92579]
  3. National Natural Science Foundation of China [U1836103]
  4. Technology Research and Development Program of Sichuan, China [2019YFG0390]

Phishing emails are becoming more complex, making existing detection methods inadequate. This article introduces an LSTM-based phishing detection method that achieves 95% accuracy through sample expansion and testing stages.
In recent years, cyber criminals have successfully invaded many important information systems using phishing emails, causing huge losses. Detecting phishing emails in big email data has therefore attracted public attention. However, the camouflage techniques used in phishing emails are becoming increasingly complex, and existing detection methods cannot cope with either the increasingly sophisticated deception or the growing volume of email. In this article, we propose an LSTM-based phishing detection method for big email data. The method has two stages: a sample expansion stage and a testing stage under sufficient samples. In the sample expansion stage, we combine KNN with K-Means to expand the training data set so that the number of training samples meets the needs of deep learning. In the testing stage, we first preprocess the samples through generalization, word segmentation, and word vector generation. The preprocessed data is then used to train an LSTM model. Finally, we classify phishing emails with the trained model. Experimental results show that the accuracy of the proposed method can reach 95 percent.
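
The abstract names the preprocessing steps (generalization, word segmentation, word vector generation) without giving their details. The following is a minimal Python sketch of what the first two steps and an index encoding might look like, not the authors' implementation. The placeholder tokens (`<url>`, `<email>`, `<num>`), the regular expressions, and the function names `generalize`, `segment`, `build_vocab`, and `encode` are all assumptions made for illustration. The integer sequences it produces are the usual input to an embedding layer placed in front of an LSTM; the word vectors themselves and the LSTM are not shown.

```python
import re
from collections import Counter

# Hypothetical patterns and placeholder tokens (not taken from the paper).
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
NUM_RE = re.compile(r"\b\d+(?:[.,]\d+)*\b")
TOKEN_RE = re.compile(r"<[a-z]+>|[a-z]+(?:'[a-z]+)?")

PAD_ID, UNK_ID = 0, 1  # reserved indices for padding and unknown words


def generalize(text):
    """Replace URLs, email addresses, and numbers with generic tokens."""
    text = URL_RE.sub(" <url> ", text)
    text = EMAIL_RE.sub(" <email> ", text)
    return NUM_RE.sub(" <num> ", text)


def segment(text):
    """Generalize, lowercase, and split an email body into word tokens."""
    return TOKEN_RE.findall(generalize(text).lower())


def build_vocab(token_lists, min_count=1):
    """Map each sufficiently frequent token to an integer index >= 2."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    kept = sorted((t for t, c in counts.items() if c >= min_count),
                  key=lambda t: (-counts[t], t))
    return {tok: i + 2 for i, tok in enumerate(kept)}


def encode(tokens, vocab, max_len):
    """Convert tokens to a fixed-length index sequence (truncate or pad)."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


emails = [
    "Verify your account at http://paypa1.example.com/login within 24 hours",
    "Contact support@example.com about your account",
]
tokenized = [segment(e) for e in emails]
vocab = build_vocab(tokenized)
sequences = [encode(toks, vocab, max_len=10) for toks in tokenized]
```

Generalizing before segmentation keeps the vocabulary small: every URL collapses to one token, so the model can learn from the presence and position of a link rather than from its exact characters.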
