4.8 Review

A systematic literature review on phishing website detection techniques

Publisher

ELSEVIER
DOI: 10.1016/j.jksuci.2023.01.004

Keywords

Phishing; Phishing Detection; Deep Learning; Cyber Security; Machine Learning

This study presents a systematic literature survey on various phishing detection approaches, including Lists Based, Visual Similarity, Heuristic, Machine Learning, and Deep Learning based techniques. The survey finds that Machine Learning techniques, particularly the Random Forest Classifier, are the most widely used in phishing detection. Furthermore, across the surveyed studies, the Convolutional Neural Network (CNN) achieves the highest reported accuracy, 99.98%, in detecting phishing websites.
Phishing is a fraud attempt in which an attacker poses as a trusted person or entity to obtain sensitive information from an internet user. In this Systematic Literature Survey (SLR), different phishing detection approaches, namely Lists Based, Visual Similarity, Heuristic, Machine Learning, and Deep Learning based techniques, are studied and compared. For this purpose, several algorithms, data sets, and techniques for phishing website detection are examined through the proposed research questions. The survey covers 80 scientific papers published in the last five years in research journals, conferences, leading workshops, theses, book chapters, and high-ranking websites. The work carried out in this study updates previous systematic literature surveys, with more focus on the latest trends in phishing detection techniques. This study enhances readers' understanding of different types of phishing website detection techniques, the data sets used, and the comparative performance of the algorithms used. According to the SLR, Machine Learning techniques have been applied the most, in 57 studies. In addition, the survey revealed that, when gathering data sets, researchers primarily accessed two sources: 53 studies used the PhishTank website for phishing data and 29 studies used Alexa's website for legitimate data. Also, as per the literature survey, most studies used Machine Learning techniques, and 31 used the Random Forest Classifier. Finally, across the surveyed studies, the Convolutional Neural Network (CNN) achieved the highest accuracy, 99.98%, for detecting phishing websites.

© 2023 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
