Article

A feature-hybrid malware variants detection using CNN based opcode embedding and BPNN based API embedding

COMPUTERS & SECURITY (2019)

Journal

COMPUTERS & SECURITY

Volume 84, Pages 376-392

Publisher

ELSEVIER ADVANCED TECHNOLOGY

DOI: 10.1016/j.cose.2019.04.005

Keywords

API call; Back-propagation neural network; Convolutional neural network; Feature-hybrid; Malware variants detection; Malware family classification; Opcode

Category

Computer Science, Information Systems

Funding

National Natural Science Foundation of China [61772191, 61472131]


Abstract

Detecting malware variants is a critical problem because of the potential damage they cause and the fast pace at which new variants appear. According to surveys from McAfee and Symantec, about 69 new malware instances are detected every minute, and more than 50% of them are variants of existing ones. Such a large volume of diversified malware variants has forced researchers to investigate new methods that use machine learning to detect common behavior patterns. However, such methods use only a single type of feature, such as opcodes or system calls, and therefore face several drawbacks. First, they lose useful information, since different types of features capture different characteristics of malware; this severely limits detection precision and recall. Second, the accuracy and speed of such methods, which trade off against each other, fail to meet users' expectations. Third, precise classification of malware families, which is important in malware analysis, remains a hard problem. In this work, we propose a feature-hybrid malware variants detection approach that integrates multiple types of features to address these challenges. We first represent opcodes with a bi-gram model and API calls with a frequency vector. We then use principal component analysis to optimize these representations and improve convergence speed. Next, we adopt a convolutional neural network for opcode-based feature embedding and a back-propagation neural network for API-based feature embedding. Finally, we combine the embedded features to train a detection model using softmax. Theoretical analysis and real-life experimental results show the efficiency and effectiveness of our approach, which achieves more than 95% malware detection accuracy and almost 90% malware family classification accuracy. The detection time of our approach is less than 0.1 s. © 2019 Elsevier Ltd. All rights reserved.
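The feature-preparation steps named in the abstract (opcode bi-grams, API call frequencies, principal component analysis, and a softmax output) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names, the toy samples, and the component count are hypothetical, and the CNN and BPNN embedding stages are omitted, so the reduced features are simply concatenated here.

```python
from collections import Counter

import numpy as np


def opcode_bigram_vector(opcodes, vocab):
    """Count adjacent opcode pairs over a fixed bi-gram vocabulary."""
    counts = Counter(zip(opcodes, opcodes[1:]))
    return np.array([counts.get(bg, 0) for bg in vocab], dtype=float)


def api_frequency_vector(api_calls, api_vocab):
    """Relative frequency of each API name in one sample's call list."""
    counts = Counter(api_calls)
    total = max(len(api_calls), 1)
    return np.array([counts.get(a, 0) / total for a in api_vocab], dtype=float)


def pca_reduce(X, k):
    """Project mean-centered rows of X onto the top-k principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T


def softmax(z):
    """Numerically stable softmax along the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


# Hypothetical toy samples: (opcode sequence, API call list).
samples = [
    (["push", "mov", "call", "mov", "call"], ["CreateFileA", "WriteFile", "WriteFile"]),
    (["mov", "xor", "jmp", "mov", "xor"], ["RegOpenKeyA", "CreateFileA"]),
    (["push", "push", "call", "ret"], ["WriteFile"]),
]

bigram_vocab = sorted({bg for ops, _ in samples for bg in zip(ops, ops[1:])})
api_vocab = sorted({a for _, apis in samples for a in apis})

X_op = np.stack([opcode_bigram_vector(ops, bigram_vocab) for ops, _ in samples])
X_api = np.stack([api_frequency_vector(apis, api_vocab) for _, apis in samples])

# Assumption: 2 components per feature type; the paper's setting is not given here.
Z = np.hstack([pca_reduce(X_op, 2), pca_reduce(X_api, 2)])

# Untrained placeholder weights for 2 classes, only to show the softmax output shape.
W = np.zeros((Z.shape[1], 2))
probs = softmax(Z @ W)
```

In the approach described above, the two reduced representations would each pass through a trained network before being merged; the zero weight matrix here only demonstrates that the softmax stage yields one probability row per sample.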
