
Web2Vec: Phishing Webpage Detection Method Based on Multidimensional Features Driven by Deep Learning

IEEE ACCESS (2020)

Journal

IEEE ACCESS

Volume 8, Pages 221214-221224

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2020.3043188

Keywords

Feature extraction; Phishing; Uniform resource locators; Deep learning; Semantics; Licenses; Learning systems; Attention mechanism; Representation learning

Funding

Shaanxi Provincial Natural Science Foundation [2020JM-533, 2018JQ5095]
Chinese Postdoctoral Science Foundation [2020M673446]

Abstract

Phishing is a kind of online attack that attempts to steal sensitive information from network users. Current phishing webpage detection methods mainly rely on manually collected features, which makes feature extraction complicated and cannot avoid possible correlations between features. To solve these problems, a new phishing webpage detection model is proposed. Its main components are representation learning, which automatically learns representations from multi-aspect features, and a hybrid deep learning network, which extracts features from those representations. First, the model treats the URL, the HTML page content, and the DOM (Document Object Model) structure of a webpage as separate character sequences and uses representation learning to automatically learn a representation of the webpage. Then, it sends the multiple representations through different channels to a hybrid deep learning network composed of a convolutional neural network and a bidirectional long short-term memory network to extract local and global features, and uses an attention mechanism to strengthen the influence of important features. Finally, the outputs of the channels are fused to produce the classification prediction. Four sets of experiments verify the detection performance of the model. The results show that its overall classification performance is better than that of existing classic phishing webpage detection methods, with an accuracy of 99.05% and a false positive rate of only 0.25%. These results indicate that extracting webpage features from multiple aspects through representation learning and a hybrid deep learning network can effectively improve the detection of phishing webpages.
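As a rough illustration of the first step in the abstract, treating a URL as a character sequence, the sketch below maps each character to an integer index and pads or truncates to a fixed length so the result could feed an embedding layer. This is a minimal sketch, not the authors' implementation: the names `build_vocab` and `encode`, the reserved `PAD`/`UNK` indices, the example URLs, and the `max_len` value are all assumptions made for illustration.

```python
from typing import Dict, List

# Reserved indices (an assumed convention, not taken from the paper).
PAD, UNK = 0, 1


def build_vocab(samples: List[str]) -> Dict[str, int]:
    """Assign each distinct character an index, starting after PAD and UNK."""
    vocab: Dict[str, int] = {}
    for text in samples:
        for ch in text:
            if ch not in vocab:
                vocab[ch] = len(vocab) + 2
    return vocab


def encode(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map characters to indices, truncating or right-padding to max_len."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Hypothetical URLs used only to exercise the functions.
urls = ["http://a.io", "http://b.io/login"]
vocab = build_vocab(urls)
encoded = [encode(u, vocab, max_len=12) for u in urls]
```

The same encoding could in principle be applied to the HTML content and to a serialized DOM structure, giving one fixed-length integer sequence per channel. How the paper actually tokenizes and sizes each channel is not stated in the abstract.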
