Article

CNN-Fusion: An effective and lightweight phishing detection method based on multi-variant ConvNet

Journal

INFORMATION SCIENCES
Volume 631, Pages 328-345

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2023.02.039

Keywords

Phishing detection; Deep learning; Convolutional neural network; Phishing attacks; Malicious websites


Phishing scams are increasing as the technical skills and costs required for phishing attacks diminish, emphasizing the need for rapid, precise, and low-cost prevention measures. Based on a character-level convolutional neural network (CNN), we present CNN-Fusion, an effective and lightweight phishing URL detection method. Our basic idea is to deploy multiple variants of a one-layer CNN with different kernel sizes in parallel to extract multi-level features. Observing that the differences between phishing and benign URLs may exhibit strong spatial correlation, we choose SpatialDropout1D, which makes the model more robust and prevents it from memorizing the training data. To further reduce the probability of errors that may be caused by irrelevant or noisy features, we apply max-over-time pooling to each feature map to keep only the most important feature. Finally, the model is evaluated on five publicly available datasets containing 1.85 million phishing and benign URLs. In addition, we assess the model against AI adversarial attacks, known as Offensive AI. Compared to existing methods, experiments demonstrate that our approach requires about 5 times less training time and offers even greater savings in memory consumption, achieving an average accuracy above 99% on the five datasets as well as on AI-generated malicious attacks.
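
For illustration, a minimal Keras sketch of a CNN-Fusion-style model as described in the abstract is given below: a character-level embedding of the URL, SpatialDropout1D, parallel one-layer convolutions with different kernel sizes, max-over-time pooling, and fusion of the pooled features. This is not the authors' code; the vocabulary size, URL length cap, embedding dimension, kernel sizes, filter count, and dropout rate are placeholder assumptions, not values from the paper.

# Illustrative sketch of a CNN-Fusion-style architecture (not the authors' code).
# All hyperparameters below are assumptions chosen for demonstration only.
import tensorflow as tf
from tensorflow.keras import layers, Model

MAX_URL_LEN = 200          # assumed cap on URL length (characters)
VOCAB_SIZE = 100           # assumed printable-character vocabulary size
EMBED_DIM = 32             # assumed character-embedding dimension
KERNEL_SIZES = (3, 5, 7)   # assumed kernel widths for the parallel CNN variants
FILTERS = 128              # assumed number of filters per variant

def build_cnn_fusion():
    inputs = layers.Input(shape=(MAX_URL_LEN,), dtype="int32")
    # Character-level embedding of the URL.
    x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)
    # SpatialDropout1D drops whole embedding channels, matching the
    # spatial-correlation argument in the abstract.
    x = layers.SpatialDropout1D(0.2)(x)
    branches = []
    for k in KERNEL_SIZES:
        # One-layer CNN variant with kernel size k.
        c = layers.Conv1D(FILTERS, k, activation="relu", padding="valid")(x)
        # Max-over-time pooling keeps only the strongest response per filter.
        c = layers.GlobalMaxPooling1D()(c)
        branches.append(c)
    # Fuse the multi-level features extracted by the parallel variants.
    fused = layers.Concatenate()(branches)
    outputs = layers.Dense(1, activation="sigmoid")(fused)  # phishing vs. benign
    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

if __name__ == "__main__":
    build_cnn_fusion().summary()

The parallel branches each see the same embedded character sequence, so kernels of width 3, 5, and 7 capture character n-gram patterns at different scales before the pooled features are concatenated for the final classification.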
