☆ 3.8 Proceedings Paper

Android Malware Detection Through a Pre-trained Model for Code Understanding

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING & AMBIENT INTELLIGENCE (UCAMI 2022) (2023)

期刊

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING & AMBIENT INTELLIGENCE (UCAMI 2022)

卷 594, 期 -, 页码 1055-1060

出版社

SPRINGER INTERNATIONAL PUBLISHING AG

DOI: 10.1007/978-3-031-21333-5_105

关键词

Android; Malware; Pre-trained model; Embedding; CodeT5

类别

Computer Science, Artificial Intelligence Computer Science, Cybernetics Computer Science, Theory & Methods Engineering, Electrical & Electronic

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This study utilizes CodeT5 pre-trained language model to generate context and semantic aware embeddings for a better representation of the behavior of Android applications. It shows how these embeddings can be used to train a recurrent neural network for malware detection tasks, and presents promising results.

Despite the large number of approaches proposed for detecting malicious applications targeting platforms such as Android, malware continuously evolves in order to avoid its detection and reach the users. Likewise, malware detection engines are continuously improved, trying to detect the most modern malware. Most of these detection tools employ signatures or machine learning models, trained on thousands of features, such as API calls, permissions or using taint analysis, among many others, and using machine learning classification algorithms such as decision trees, ensemble methods or deep learning. However, the use of these features leads to biased models due to the use of limited datasets, without considering the real semantics (goals and intentions) of the malicious sample. In this paper, we conduct an initial study of the use of context and semantic aware embeddings generated with the CodeT5 pre-trained language model for a better representation of the behaviour of Android applications. After decompiling a sample to Java, it is possible to generate embeddings from chunks of the source code, generating a rich representation of the sample. We show how these embeddings can be used to train a recurrent neural network for malware detection tasks, evidencing promising results.

Android Malware Detection Through a Pre-trained Model for Code Understanding

期刊

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING & AMBIENT INTELLIGENCE (UCAMI 2022)

出版社

SPRINGER INTERNATIONAL PUBLISHING AG

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Android Malware Detection Through a Pre-trained Model for Code Understanding

期刊

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING & AMBIENT INTELLIGENCE (UCAMI 2022)

出版社

SPRINGER INTERNATIONAL PUBLISHING AG

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文