3.8 Proceedings Paper

Exploring Regular Expression Comprehension

出版社

IEEE

关键词

Regular expression comprehension; equivalence class; regex representations

资金

  1. NSF [SHF-EAGER-1446932, SHF-1645136]
  2. Direct For Computer & Info Scie & Enginr
  3. Division of Computing and Communication Foundations [1645136, 1564162] Funding Source: National Science Foundation

向作者/读者索取更多资源

The regular expression (regex) is a powerful tool employed in a large variety of software engineering tasks. However, prior work has shown that regexes can be very complex and that it could be difficult for developers to compose and understand them. This work seeks to identify code smells that impact comprehension. We conduct an empirical study on 42 pairs of behaviorally equivalent but syntactically different regexes using 180 participants and evaluate the understandability of various regex language features. We further analyze regexes in GitHub to find the community standards or the common usages of various features. We found that some regex expression representations are more understandable than others. For example, using a range (e.g., [0-9]) is often more understandable than a default character class (e.g., [\d]). We also found that the DFA size of a regex significantly affects comprehension for the regexes studied. The larger the DFA of a regex (up to size eight), the more understandable it was. Finally, we identify smelly and non-smelly regex representations based on a combination of community standards and understandability metrics.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

3.8
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据