4.7 Article

Algorithmically generated malicious domain names detection based on n-grams features

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 170

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2020.114551

Keywords

Domain generation algorithm; Botnet; Machine learning; DNS query; Kullback-Leibler divergence; Jaccard Index

This paper presents a methodology for detecting DGA-generated domain names using supervised machine learning. The approach relies only on lexical features, classifies previously unseen domains with good accuracy, and outperforms some state-of-the-art featureless classification methods based on deep learning.
Botnets are one of the major cyber infections used in criminal activities. In most botnets, bots use a Domain Generation Algorithm (DGA) to make DNS queries aimed at establishing a connection with the Command and Control (C&C) server. Identifying such queries by monitoring network DNS traffic is therefore crucial for bot detection. In this paper we present a methodology to detect DGA-generated domain names based on a supervised machine learning process, trained with a dataset of known benign and malicious domain names. The proposed approach represents each domain name through a set of features that express the similarity between the 2-grams and 3-grams in a single unclassified domain name and those in domain names known to be malicious or benign. We used the Kullback-Leibler divergence and the Jaccard Index to estimate the similarity, and we tested different machine learning algorithms to classify each domain name as benign or DGA-based (using both binary and multi-class approaches). The results of our experiments demonstrate that the proposed methodology, which exploits only lexical features of domain names, attains a good level of accuracy and yields a general model able to classify previously unseen domains effectively. It also outperforms some of the state-of-the-art featureless classification methods based on deep learning.
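
The abstract does not spell out how the similarity features are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes character n-grams taken from the domain label, a Jaccard Index over n-gram sets, and a Kullback-Leibler divergence over smoothed n-gram frequencies. The function names, the additive smoothing constant `eps`, and the ordering of the resulting feature vector are all hypothetical choices made for illustration.

```python
from collections import Counter
import math


def ngrams(name, n):
    """Return the list of character n-grams of a domain label."""
    return [name[i:i + n] for i in range(len(name) - n + 1)]


def jaccard_index(a, b):
    """|A ∩ B| / |A ∪ B| over two n-gram collections; 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def kl_divergence(p_counts, q_counts, eps=1e-9):
    """D_KL(P || Q) between two n-gram frequency distributions.

    Additive smoothing with eps is an assumption of this sketch; it keeps
    the logarithm finite when an n-gram is missing from one side.
    """
    vocab = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + eps * len(vocab)
    q_total = sum(q_counts.values()) + eps * len(vocab)
    div = 0.0
    for gram in vocab:
        p = (p_counts.get(gram, 0) + eps) / p_total
        q = (q_counts.get(gram, 0) + eps) / q_total
        div += p * math.log(p / q)
    return div


def reference_counts(names, n):
    """Aggregate n-gram counts over a list of reference domain labels."""
    counts = Counter()
    for name in names:
        counts.update(ngrams(name, n))
    return counts


def features(name, benign, malicious):
    """Hypothetical 8-value feature vector: for n in (2, 3) and for each
    reference set (benign, then malicious), the Jaccard Index followed by
    the KL divergence of the name's n-grams against that reference set."""
    feats = []
    for n in (2, 3):
        grams = ngrams(name, n)
        own = Counter(grams)
        for reference in (benign, malicious):
            ref = reference_counts(reference, n)
            feats.append(jaccard_index(grams, ref))
            feats.append(kl_divergence(own, ref))
    return feats
```

A vector such as `features("google", benign_names, dga_names)` could then be passed to any supervised classifier. The abstract says several algorithms were tested but does not name them here, so that choice is left open.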
