Article

An empirical study of optimization bugs in GCC and LLVM

JOURNAL OF SYSTEMS AND SOFTWARE (2021)

Journal

JOURNAL OF SYSTEMS AND SOFTWARE

Volume 174

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.jss.2020.110884

Keywords

Empirical study; Compiler reliability; Bug characteristics; Compiler optimization bugs; Compiler testing

Categories

Computer Science, Software Engineering; Computer Science, Theory & Methods

Funding

National Natural Science Foundation of China [61772107, 61722202, 62032004, 62072068]

Abstract

Optimization bugs are a significant class of compiler bugs. Value range propagation is the most bug-prone optimization in GCC, and instruction combine is the most bug-prone in LLVM. Most optimization bugs are misoptimization bugs; they tend to persist for long periods before being fixed, and their fixes usually modify only a small number of lines of code. The study aims to guide developers and researchers in designing compiler optimizations and suggests the need for more effective techniques and tools for testing them.

Optimizations are a fundamental component of compilers. Bugs in optimizations have significant impacts and can cause unintended application behavior and disasters, especially in safety-critical domains. Thus, an in-depth analysis of optimization bugs should be conducted to help developers understand and test the optimizations in compilers. To this end, we conduct an empirical study to investigate the characteristics of optimization bugs in two mainstream compilers, GCC and LLVM. We collect about 57K GCC bugs and 22K LLVM bugs, and then exhaustively examine 8,771 and 1,564 optimization bugs of the two compilers, respectively. The results reveal the following five characteristics of optimization bugs: (1) apart from the C++ component, optimization is the buggiest component in both compilers; (2) the value range propagation optimization and the instruction combine optimization are the buggiest optimizations in GCC and LLVM, respectively, and the loop optimizations in both GCC and LLVM are more bug-prone than other optimizations; (3) most of the optimization bugs in both GCC and LLVM are misoptimization bugs, accounting for 57.21% and 61.38%, respectively; (4) on average, the optimization bugs live over five months, and developers take 11.16 months for GCC and 13.55 months for LLVM to fix an optimization bug; in both GCC and LLVM, many confirmed optimization bugs have lived for a long time; (5) the fixes of optimization bugs involve no more than two files and three functions on average in both compilers, around 99% of them modify no more than 100 lines of code, and 90% modify fewer than 50 lines of code. Our study provides a deep understanding of optimization bugs, which could give developers and researchers useful guidance for better designing the optimizations in compilers. In addition, the analysis results suggest that we need more effective techniques and tools to test compiler optimizations. Moreover, our findings are also useful for research on automatic debugging techniques for compilers, such as automatic compiler bug isolation techniques. © 2020 Elsevier Inc. All rights reserved.
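To make the lifetime and fix-size measures in findings (4) and (5) concrete, the following is a minimal sketch of how such statistics could be computed from bug-tracker records. It is not the authors' tooling: the `BugRecord` type, its field names, the 30-day month approximation, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BugRecord:
    """Hypothetical bug-tracker record; field names are illustrative only."""
    reported: date
    fixed: date
    lines_changed: int


def mean_lifetime_months(bugs):
    """Average report-to-fix time, approximating one month as 30 days."""
    total_days = sum((b.fixed - b.reported).days for b in bugs)
    return total_days / len(bugs) / 30.0


def share_within(bugs, max_lines):
    """Fraction of fixes that change at most max_lines lines of code."""
    return sum(b.lines_changed <= max_lines for b in bugs) / len(bugs)


# Made-up sample data, not taken from the study.
SAMPLE = [
    BugRecord(date(2019, 1, 1), date(2019, 3, 2), 12),
    BugRecord(date(2019, 2, 1), date(2020, 1, 27), 80),
    BugRecord(date(2019, 5, 1), date(2019, 5, 31), 150),
    BugRecord(date(2019, 6, 1), date(2019, 8, 30), 3),
]

if __name__ == "__main__":
    print(f"mean lifetime (months): {mean_lifetime_months(SAMPLE):.2f}")
    print(f"fixes within 100 lines: {share_within(SAMPLE, 100):.0%}")
    print(f"fixes within 50 lines:  {share_within(SAMPLE, 50):.0%}")
```

Note that `share_within` uses an inclusive threshold, which matches the "no more than 100 lines" wording; the "fewer than 50 lines" figure would need a strict comparison instead.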
