Proceedings Paper

Phishing URL Detection via CNN and Attention-Based Hierarchical RNN

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TrustCom/BigDataSE.2019.00024

Keywords

Phishing Detection; Cyber Security; Machine Learning; Deep Learning

Funding

  1. National Key R&D Program of China [2017YFB0202502, SQ2017YFGX060171-03]

Phishing websites have long been a serious threat to cyber security, and for decades researchers have worked on techniques to detect them automatically. While state-of-the-art solutions achieve strong performance, they require substantial manual feature engineering and are not adept at detecting newly emerging phishing attacks. Detecting phishing websites automatically while handling zero-day phishing attacks swiftly therefore remains an open challenge. In this work, we propose PhishingNet, a deep learning-based approach for timely detection of phishing Uniform Resource Locators (URLs). Specifically, we use a Convolutional Neural Network (CNN) module to extract character-level spatial feature representations of URLs; meanwhile, we employ an attention-based hierarchical Recurrent Neural Network (RNN) module to extract word-level temporal feature representations of URLs. We then fuse these feature representations via a three-layer CNN to build accurate feature representations of URLs, on which we train a phishing URL classifier. Extensive experiments on a verified dataset collected from the Internet demonstrate that the automatically extracted feature representations improve the generalization ability of our approach on newly emerging URLs, allowing it to achieve competitive performance against other state-of-the-art approaches.
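
The abstract describes two views of each URL, a character-level sequence for the CNN module and a word-level sequence for the attention-based RNN module. The sketch below illustrates how such inputs might be prepared and what attention pooling computes. It is not the PhishingNet implementation: the function names, the delimiter set, the id mapping, and the fixed `max_len` are all assumptions made for illustration, and in a trained model the attention scores would be learned rather than supplied by hand.

```python
import math
import re


def char_ids(url, max_len=20, pad_id=0):
    """Character-level view: one integer id per character, padded or truncated.

    Assumption: ids are derived from ASCII code points; id 0 is reserved for padding.
    """
    ids = [min(ord(c), 127) + 1 for c in url[:max_len]]
    return ids + [pad_id] * (max_len - len(ids))


def word_tokens(url):
    """Word-level view: split the URL on common delimiters, dropping empty pieces.

    Assumption: this delimiter set is illustrative, not taken from the paper.
    """
    return [t for t in re.split(r"[:/?&=.\-_@#%+]+", url) if t]


def attention_pool(vectors, scores):
    """Softmax-weighted sum of equally sized vectors.

    `scores` stands in for the relevance scores an attention layer would learn.
    """
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights
```

For example, `word_tokens("http://paypal.example.com/login?id=1")` yields the tokens `http`, `paypal`, `example`, `com`, `login`, `id`, and `1`, while `char_ids` turns the same string into a fixed-length list of integers suitable for an embedding layer followed by convolutions.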
