Proceedings Paper

Ensemble Model Ransomware Classification: A Static Analysis-based Approach

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-981-16-6723-7_12

Keywords

Tf-idf; N-gram; Random forest; Opcode; SVM; Ransomware; Voting classifier

Funding

  1. Science and Engineering Research Board (SERB), the Department of Science and Technology (DST), Government of India (GoI)
  2. DST-SERB [ECR/2018/001709]

Malware attacks have been growing rapidly, and the COVID-19 pandemic has increased reliance on digital technology. This has pushed the anti-malware community to develop better software to detect and mitigate these attacks. We propose a static analysis method coupled with machine learning to classify ransomware, improving the efficiency of the classification model.
Malware attacks have grown sharply in the recent past. The COVID-19 pandemic has increased the dependence of a larger-than-usual workforce on digital technology. This has forced the anti-malware community to build better software that mitigates malware attacks by detecting them before they wreak havoc. The key part of protecting a system from a malware attack is identifying whether a given file or piece of software is malicious. Ransomware attacks are time-sensitive: they must be stopped before the attack manifests, because the damage becomes irreversible once the attack reaches a certain stage. Dynamic analysis employs many methods to observe how ransomware behaves when allowed to run freely, but doing so still risks exposing the system to malicious code. Moreover, ransomware that can sense the analysis environment is likely to elude the methods used in dynamic analysis. We propose a static analysis method combined with machine learning that classifies ransomware using opcodes extracted by disassemblers. By selecting the most appropriate feature vectors through the tf-idf feature selection method and tuning the parameters that best represent each class, we can increase the efficiency of the ransomware classification model. The ensemble learning-based model, built on N-gram sequences of static opcode data, was found to improve performance significantly compared with RF, SVM, LR, and GBDT models when tested against a dataset of live encrypting ransomware samples that used evasive techniques to dodge dynamic malware analysis.
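The pipeline described in the abstract (opcode N-grams, tf-idf weighting for feature selection, then an ensemble vote) can be illustrated with a minimal standard-library sketch. The abstract does not state the N-gram length, the exact tf-idf variant, or the voting scheme, so the choices here (bigrams, unsmoothed tf-idf, hard majority voting) are assumptions, and the opcode sequences and function names are hypothetical toy examples, not the authors' data or code.

```python
import math
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Return contiguous opcode n-grams as space-joined strings."""
    return [" ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def tfidf(docs):
    """Compute unsmoothed tf-idf weights for each document.

    docs is a list of token lists. tf = count / document length,
    idf = log(N / df). This variant is an assumption; the paper may
    use a different normalisation.
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency counts each doc once
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero on empty docs
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def top_features(weights, k):
    """Pick the k n-grams with the highest tf-idf seen in any document."""
    best = {}
    for w in weights:
        for token, value in w.items():
            best[token] = max(best.get(token, 0.0), value)
    # Sort by descending weight, then alphabetically for stable ties.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


def majority_vote(predictions):
    """Hard voting: return the label predicted by most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical disassembled opcode sequences for two samples.
sample_a = ["push", "mov", "call", "xor", "mov", "call"]
sample_b = ["push", "mov", "add", "ret", "push", "mov"]

docs = [opcode_ngrams(sample_a), opcode_ngrams(sample_b)]
weights = tfidf(docs)
selected = top_features(weights, k=1)

# Hypothetical labels from three base classifiers for one sample.
label = majority_vote(["ransomware", "benign", "ransomware"])
```

An N-gram that occurs in every sample (here `push mov`) receives an idf of zero, so it drops out of the selected features, while N-grams concentrated in one sample rank highest. In practice the selected features would be fed to trained base classifiers whose predictions are then combined by the vote.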
