Article

Vulnerability detection through cross-modal feature enhancement and fusion

Journal

COMPUTERS & SECURITY
Volume 132

Publisher

ELSEVIER ADVANCED TECHNOLOGY
DOI: 10.1016/j.cose.2023.103341

Keywords

Software security; Multimodal deep learning; Fine-grained cross modal alignment; Co-attention; Vulnerability detection

This paper proposes a new multimodal deep-learning-based vulnerability detection method that improves performance through cross-modal feature enhancement and fusion. It uses a compilation and debugging method to establish alignment relationships between source code statements and assembly instructions, as well as between source code variables and assembly code registers. Using these alignment relationships together with program slicing technology, a cross-slicing method generates bimodal program slices. A cross-modal feature-enhanced code representation learning model then uses co-attention mechanisms to capture the fine-grained semantic correlation between source code and assembly code. Vulnerability detection is achieved through feature-level fusion of the semantic features captured in the fine-grained aligned source code and assembly code.
Software vulnerability detection is critical to computer security. Most existing vulnerability detection methods use single-modal models, which cannot effectively extract cross-modal features. To solve this problem, we propose a new multimodal deep-learning-based vulnerability detection method built on cross-modal feature enhancement and fusion. First, we use a special compilation and debugging method to obtain the alignment relationship between source code statements and assembly instructions, as well as between source code variables and assembly code registers. Based on this alignment relationship and program slicing technology, we propose a cross-slicing method to generate bimodal program slices. Then, we propose a cross-modal feature-enhanced code representation learning model that uses co-attention mechanisms to capture the fine-grained semantic correlation between source code and assembly code. Finally, vulnerability detection is achieved by feature-level fusion of the semantic features captured in the fine-grained aligned source code and assembly code. Extensive experiments show that our method improves vulnerability detection performance compared with state-of-the-art methods. Specifically, our method achieves an accuracy of 97.4% and an F1-measure of 93.4% on the SARD dataset. It also achieves an average accuracy of 95.4% and an F1-measure of 89.1% on two real-world software projects (FFmpeg and OpenSSL), improving over the state-of-the-art method by 4.5% and 2.9%, respectively. (c) 2023 Elsevier Ltd. All rights reserved.
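To make the co-attention and feature-level fusion steps in the abstract more concrete, here is a minimal sketch in plain Python. It is not the paper's architecture. It assumes each modality is already embedded as a list of equal-width token vectors, uses a parameter-free scaled dot-product affinity in place of any learned weights, and pools by simple averaging. The function names (`co_attention_fuse`, `softmax_rows`, and so on) and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Normalise every row into attention weights that sum to 1."""
    out = []
    for row in m:
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def mean_pool(m):
    """Average the rows of a matrix into a single vector."""
    return [sum(col) / len(m) for col in zip(*m)]


def co_attention_fuse(src, asm):
    """Illustrative co-attention between two modalities (assumed design).

    src: n_src x d source-code token vectors
    asm: n_asm x d assembly token vectors
    Returns one fused feature vector of length 4 * d.
    """
    d = len(src[0])
    # Affinity of every source token with every assembly token.
    affinity = [[v / math.sqrt(d) for v in row] for row in matmul(src, transpose(asm))]

    # Attention in both directions over the same affinity matrix.
    src_to_asm = softmax_rows(affinity)             # n_src x n_asm
    asm_to_src = softmax_rows(transpose(affinity))  # n_asm x n_src

    # Cross-modal context: each token summarised in terms of the other modality.
    src_ctx = matmul(src_to_asm, asm)  # n_src x d
    asm_ctx = matmul(asm_to_src, src)  # n_asm x d

    # Enhancement: list concatenation of a token with its context (width 2 * d).
    src_enh = [s + c for s, c in zip(src, src_ctx)]
    asm_enh = [a + c for a, c in zip(asm, asm_ctx)]

    # Feature-level fusion: pool each modality, then concatenate the two vectors.
    return mean_pool(src_enh) + mean_pool(asm_enh)


if __name__ == "__main__":
    # Toy embeddings, invented for this example only.
    source_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    assembly_tokens = [[1.0, 0.0], [0.0, 1.0]]
    print(co_attention_fuse(source_tokens, assembly_tokens))
```

In a trained model the fused vector would feed a classifier that labels the bimodal slice as vulnerable or not. Here the sketch stops at the fused representation, since the abstract does not specify the classifier.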
