Article

Summarizing source code through heterogeneous feature fusion and extraction

Journal

INFORMATION FUSION
Volume 103, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.inffus.2023.102058

Keywords

Code summarization; Feature fusion; Heterogeneous graph; Graph neural network; Transformer

Code summarization is crucial for software maintenance, aiming to generate concise natural-language descriptions summarizing the functionality of source code automatically. This paper proposes HetCoS to extract the syntactic and sequential features of source code by exploring its inherent heterogeneity for code summarization. Experimental results demonstrate the superiority of our approach over sixteen state-of-the-art baselines.
Code summarization, which seeks to automatically produce a succinct natural-language description of the functionality of source code, plays an essential role in software maintenance. Many existing approaches first encode the source code based on its Abstract Syntax Tree (AST) and then decode it into a textual summary. However, most of them treat the AST-based syntax structure as a homogeneous graph and do not distinguish the different relations between graph nodes (e.g., the parent-child and sibling relations). To mitigate this issue, this paper proposes HetCoS, which extracts the syntactic and sequential features of source code by exploiting its inherent heterogeneity. Specifically, we first build a Heterogeneous Code Graph (HCG) that fuses the syntax structure and the code sequence, with eight types of edges/relations defined between graph nodes. We then present a heterogeneous graph neural network that captures the diverse relations in the HCG. The encoded HCG is fed into a Transformer decoder, followed by a multi-head attention-based copying mechanism, to support high-quality summary generation. Extensive experiments on major Java and Python datasets demonstrate the superiority of our approach over sixteen state-of-the-art baselines. To promote reproducibility, we make the implementation of HetCoS publicly available at https://github.com/GJCEXP/HETCOS.
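
To make the idea of typed relations concrete, the following minimal sketch builds a small graph over a Python AST and stores edges separately per relation. It is not the HetCoS implementation: the abstract names only the parent-child and sibling relations as examples, so the three relation names used here (`parent_child`, `child_parent`, `next_sibling`) and the function `build_typed_code_graph` are illustrative assumptions, and the paper's eight edge types and its token-sequence nodes are not reproduced.

```python
import ast
from collections import defaultdict


def build_typed_code_graph(source):
    """Parse Python source and return (nodes, edges).

    nodes: list of AST node type names, indexed in pre-order.
    edges: dict mapping a relation name to a list of (src, dst) index pairs.
    The relation names are hypothetical, chosen for illustration only.
    """
    tree = ast.parse(source)
    nodes = []
    edges = defaultdict(list)

    def visit(node):
        idx = len(nodes)
        nodes.append(type(node).__name__)
        # Recurse first so that every child has an index before edges are added.
        child_ids = [visit(child) for child in ast.iter_child_nodes(node)]
        for child_idx in child_ids:
            edges["parent_child"].append((idx, child_idx))
            edges["child_parent"].append((child_idx, idx))
        # Link each child to the sibling immediately to its right.
        for left, right in zip(child_ids, child_ids[1:]):
            edges["next_sibling"].append((left, right))
        return idx

    visit(tree)
    return nodes, dict(edges)


if __name__ == "__main__":
    nodes, edges = build_typed_code_graph("def add(a, b):\n    return a + b\n")
    for relation, pairs in edges.items():
        print(relation, len(pairs))
```

Keeping one edge list per relation is what lets a heterogeneous graph neural network apply relation-specific parameters; a homogeneous treatment would merge all of these lists into a single adjacency structure and lose the distinction.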
