Journal
INVENTIVE COMPUTATION AND INFORMATION TECHNOLOGIES, ICICIT 2021
Volume 336, Issue -, Pages 153-167
Publisher
SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-981-16-6723-7_12
Keywords
Tf-idf; N-gram; Random forest; Opcode; SVM; Ransomware; Voting classifier
Funding
- Science and Engineering Research Board (SERB), the Department of Science and Technology (DST), Government of India (GoI)
- DST-SERB [ECR/2018/001709]
Malware attacks have been growing rapidly, and the COVID-19 pandemic has increased reliance on digital technology. This has pushed the anti-malware community to develop better software to detect and mitigate these attacks. We propose a static analysis method coupled with machine learning to classify ransomware, which improves the efficiency of the classification model.
The growth of malware attacks has been phenomenal in the recent past. The COVID-19 pandemic has increased the dependence of a larger-than-usual workforce on digital technology. This has forced the anti-malware community to build better software that mitigates malware attacks by detecting them before they wreak havoc. The key part of protecting a system from a malware attack is identifying whether a given file or piece of software is malicious. Ransomware attacks are time-sensitive: they must be stopped before the attack manifests, because the damage becomes irreversible once the attack reaches a certain stage. Dynamic analysis employs many methods to decipher how ransomware files behave when given free rein. However, doing so still risks exposing the system to malicious code, and ransomware that can sense the analysis environment is likely to elude the methods used in dynamic analysis. We propose a static analysis method combined with machine learning for classifying ransomware using opcodes extracted by disassemblers. By selecting the most appropriate feature vectors through the tf-idf feature selection method and tuning the parameters that better represent each class, we can increase the efficiency of the ransomware classification model. The ensemble learning-based model, implemented on top of N-gram sequences of static opcode data, was found to improve performance significantly in comparison to RF, SVM, LR, and GBDT models when tested against a dataset of live encrypting ransomware samples that used evasive techniques to dodge dynamic malware analysis.
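The feature-extraction idea described in the abstract, N-grams over disassembled opcode sequences weighted by tf-idf, can be illustrated with a minimal sketch. The abstract does not give the exact tf-idf variant, the value of N, or the selection threshold, so the formula below (term frequency times log of inverse document frequency), the bigram setting, the helper names, and the toy opcode sequences are all assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Return contiguous opcode n-grams as space-joined strings."""
    return [" ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def tfidf_vectors(samples, n=2):
    """Compute one {n-gram: tf-idf weight} dict per opcode sequence.

    Assumed formula (not taken from the paper):
      tf  = count of the n-gram in the sample / total n-grams in the sample
      idf = log(number of samples / number of samples containing the n-gram)
    """
    grams_per_sample = [Counter(opcode_ngrams(s, n)) for s in samples]

    # Document frequency: in how many samples each n-gram occurs.
    doc_freq = Counter()
    for counts in grams_per_sample:
        doc_freq.update(counts.keys())

    total_docs = len(samples)
    vectors = []
    for counts in grams_per_sample:
        total = sum(counts.values())
        vectors.append({
            gram: (count / total) * math.log(total_docs / doc_freq[gram])
            for gram, count in counts.items()
        })
    return vectors


def top_features(vectors, k=3):
    """Rank n-grams by their highest tf-idf weight across all samples."""
    best = {}
    for vec in vectors:
        for gram, weight in vec.items():
            best[gram] = max(best.get(gram, 0.0), weight)
    # Sort by descending weight, breaking ties alphabetically.
    return sorted(best, key=lambda g: (-best[g], g))[:k]


# Toy opcode sequences (hypothetical, not from the paper's dataset).
samples = [
    ["push", "mov", "call", "xor", "xor", "jmp"],
    ["push", "mov", "call", "add", "ret"],
    ["push", "mov", "sub", "ret"],
]
vectors = tfidf_vectors(samples, n=2)
```

In this toy data the bigram `push mov` occurs in every sample, so its weight is zero, while bigrams unique to one sample receive the largest weights. The selected features would then feed the classifiers named in the abstract (RF, SVM, LR, GBDT, and a voting ensemble), which this sketch does not cover.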