Journal
IEEE ACCESS
Volume 8, Issue -, Pages 150672-150684
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2020.3016774
Keywords
Software; Security; Natural language processing; Syntactics; Predictive models; Software metrics; Machine learning; AST; machine learning; source code; vulnerability prediction
Funding
- Scientific and Technological Research Council of Turkey through the 1515 Frontier Research and Development Laboratories Support Program [5169902]
As the role of information and communication technologies gradually increases in our lives, software security becomes a major issue, both to protect against malicious attempts and to avoid irreparable damage to the system. With the advent of data-driven techniques, there is now a growing interest in how to leverage machine learning (ML) as a software assurance method to build trustworthy software systems. In this study, we examine how to use ML to predict software vulnerabilities from source code before the software is released. To this end, we develop a source code representation method that enables us to perform intelligent analysis on the Abstract Syntax Tree (AST) form of source code, and then investigate whether ML can distinguish vulnerable from nonvulnerable code fragments. To make a comprehensive performance evaluation, we use a public dataset that contains a large number of real function-level source code samples mined from open-source projects and carefully labeled according to the type of vulnerability, if any. We show the effectiveness of our proposed method for vulnerability prediction from source code by carrying out exhaustive and realistic experiments under different regimes, in comparison with state-of-the-art methods.
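As a rough illustration of what an AST-based source code representation can look like, the sketch below uses Python's standard `ast` module to reduce a function to a bag of node-type counts, a kind of fixed-vocabulary feature that an ML classifier could consume. This is not the paper's encoding, which the abstract does not specify; the helper name `ast_node_counts`, the choice of Python as the analyzed language, and the sample function are all assumptions made for illustration.

```python
import ast
from collections import Counter


def ast_node_counts(source: str) -> Counter:
    """Parse source code and count how often each AST node type occurs.

    Hypothetical helper: a simple bag-of-node-types representation,
    not the representation method proposed in the paper.
    """
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


# Hypothetical sample function, used only to show the shape of the features.
sample = '''
def copy_items(src, n):
    dst = []
    for i in range(n):
        dst.append(src[i])
    return dst
'''

features = ast_node_counts(sample)
print(features.most_common(5))
```

In a full pipeline, counts like these (or richer encodings of the tree structure) would be computed per function and paired with vulnerability labels to train a classifier.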