☆ 4.1 Article

Are our clone detectors good enough? An empirical study of code effects by obfuscation

CYBERSECURITY (2023)

Journal

CYBERSECURITY

Volume 6, Issue 1

Publisher

SPRINGERNATURE

DOI: 10.1186/s42400-023-00148-x

Keywords

Clone detection; Obfuscation; Evaluation

Categories

Computer Science, Information Systems; Computer Science, Interdisciplinary Applications; Computer Science, Software Engineering

AI Summary

Clone detection is crucial in many fields, but obfuscation by cyber criminals can hinder it. Most previous studies compared only an original code snippet with its obfuscated version, overlooking the case in which code is first cloned (modified) and then obfuscated. This research evaluates deep learning-based and traditional clone detectors against obfuscated code, using a benchmark of 524,148 code pairs generated with different obfuscation strategies. The findings show how obfuscation affects clone detection and how the two families of detectors differ.

Abstract

Clone detection has received much attention in fields such as malicious code detection, vulnerability hunting, and code copyright infringement detection. However, cyber criminals may obfuscate code to impede the detection of violations. To date, few studies have investigated the robustness of clone detectors against obfuscation, especially the currently popular deep learning-based ones. Moreover, most of these studies measure only the difference between one code snippet and its obfuscated version. In reality, attackers may modify the original code before obfuscating it, so what should be evaluated is whether a detector can match obfuscated code against cloned code, not against the original code. To this end, we conduct a comprehensive study evaluating 3 popular deep learning-based clone detectors and 6 commonly used traditional ones. For the data, we collect 6512 clone pairs of five types from the BigCloneBench dataset and obfuscate one program of each pair via 64 strategies of 6 state-of-the-art commercial obfuscators. We also collect 1424 non-clone pairs to evaluate false positives. In total, a benchmark of 524,148 code pairs (either clone or not) is generated and passed to the clone detectors for evaluation. To automate the evaluation, we develop a uniform evaluation framework that integrates the clone detectors and obfuscators. The results yield interesting findings on how obfuscation affects the performance of clone detection and on how traditional and deep learning-based clone detectors differ. In addition, we conduct manual code reviews to uncover the root causes of the observed behavior and give suggestions to users from different perspectives.
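The evaluation design described in the abstract (obfuscate one program of each pair, run a detector on the result, and measure misses on clone pairs and false positives on non-clone pairs) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the toy identifier-renaming "obfuscator", the token-overlap "detector", and the two sample pairs are all hypothetical stand-ins for the commercial obfuscators, the 9 real detectors, and the BigCloneBench data.

```python
import re
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]
Obfuscator = Callable[[str], str]
Detector = Callable[[str, str], bool]


def rename_identifiers(code: str) -> str:
    """Toy obfuscation (hypothetical): rename every word-like token to v0, v1, ...

    Keywords are renamed too, so the output is not valid code; this only
    mimics the lexical damage that a real obfuscator can cause.
    """
    mapping: Dict[str, str] = {}

    def repl(match: "re.Match[str]") -> str:
        name = match.group(0)
        if name not in mapping:
            mapping[name] = f"v{len(mapping)}"
        return mapping[name]

    return re.sub(r"[A-Za-z_]\w*", repl, code)


def token_jaccard_detector(threshold: float) -> Detector:
    """Toy token-based detector (hypothetical): Jaccard similarity of token sets."""

    def tokens(code: str) -> set:
        return set(re.findall(r"\w+|[^\w\s]", code))

    def detect(a: str, b: str) -> bool:
        ta, tb = tokens(a), tokens(b)
        union = ta | tb
        return bool(union) and len(ta & tb) / len(union) >= threshold

    return detect


def evaluate(
    detector: Detector,
    clone_pairs: List[Pair],
    non_clone_pairs: List[Pair],
    obfuscators: Dict[str, Obfuscator],
) -> Dict[str, Dict[str, float]]:
    """For each strategy, obfuscate the second program of every pair and score the detector."""
    results: Dict[str, Dict[str, float]] = {}
    for name, obfuscate in obfuscators.items():
        hits = sum(detector(a, obfuscate(b)) for a, b in clone_pairs)
        alarms = sum(detector(a, obfuscate(b)) for a, b in non_clone_pairs)
        results[name] = {
            "recall": hits / len(clone_pairs) if clone_pairs else 0.0,
            "false_positive_rate": alarms / len(non_clone_pairs) if non_clone_pairs else 0.0,
        }
    return results


# Hypothetical sample data: one clone pair and one non-clone pair.
clone_pairs = [
    ("int add(int a, int b) { return a + b; }",
     "int sum(int x, int y) { return x + y; }"),
]
non_clone_pairs = [
    ("int add(int a, int b) { return a + b; }",
     "while (true) { print(queue.pop()); }"),
]
obfuscators = {"identity": lambda code: code, "rename": rename_identifiers}

results = evaluate(token_jaccard_detector(0.5), clone_pairs, non_clone_pairs, obfuscators)
for strategy, scores in results.items():
    print(strategy, scores)
```

In this toy setup the clone pair is matched when left unobfuscated but missed once identifiers are renamed, which is the kind of robustness gap the study sets out to measure at scale. A real evaluation would replace both callables with wrappers around actual obfuscators and detectors.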
