4.6 Article

A Novel Solutions for Malicious Code Detection and Family Clustering Based on Machine Learning

Journal

IEEE ACCESS
Volume 7, Issue -, Pages 148853-148860

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2019.2946482

Keywords

Malware; ensemble model; malware classification; family clustering; t-SNE

Funding

  1. National Natural Science Foundation of China (NSFC) [61672020, U1803263, 61972106, U1636215, 61972105]
  2. National Key Research and Development Program of China [2019QY1406]
  3. Key Research and Development Program of Guangdong Province [2019B010136003]

Malware has become a major threat to cyberspace security, not only because malware itself is growing more complex, but also because new malicious code is continuously being created. In this paper, we propose two novel methods for the malware identification problem. The first addresses malware classification. Unlike traditional machine learning approaches, our method introduces ensemble models to solve the malware classification problem. The second addresses malware family clustering. Unlike classic malware family clustering algorithms, our method introduces the t-SNE algorithm to visualize the feature data and then determines the number of malware families. Both methods have been extensively tested on a large number of real-world malware samples. The results show that the first method is far superior to the existing individual models and that the second adapts well. Our methods can be used for malicious code classification and family clustering with higher accuracy.
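To make the ensemble idea in the abstract concrete, the following is a minimal sketch of one common ensemble form, hard majority voting over several individual classifiers. The abstract does not say which ensemble scheme the paper uses, so the voting rule, the three toy base models, and the feature layout (entropy, import count, file size in KB) are all assumptions chosen for illustration, not the authors' method.

```python
from collections import Counter
from typing import Callable, Sequence

# Assumed feature layout (hypothetical): (entropy, import_count, size_kb)
Features = Sequence[float]
Classifier = Callable[[Features], str]


def majority_vote(classifiers: Sequence[Classifier], sample: Features) -> str:
    """Return the label predicted by the most base classifiers."""
    votes = [clf(sample) for clf in classifiers]
    # most_common(1) yields the (label, count) pair with the highest count
    return Counter(votes).most_common(1)[0][0]


# Hypothetical base models: simple threshold rules standing in for
# individually trained classifiers.
def by_entropy(x: Features) -> str:
    return "malicious" if x[0] > 7.0 else "benign"


def by_imports(x: Features) -> str:
    return "malicious" if x[1] > 50 else "benign"


def by_size(x: Features) -> str:
    return "malicious" if x[2] < 100 else "benign"


ENSEMBLE = [by_entropy, by_imports, by_size]


def classify(sample: Features) -> str:
    """Classify a sample with the three-model voting ensemble."""
    return majority_vote(ENSEMBLE, sample)
```

For example, classify((7.5, 60, 500)) combines two "malicious" votes with one "benign" vote. The point of the design is that a single base model's mistake can be outvoted by the others, which is the usual motivation for preferring an ensemble over any individual model.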
