4.6 Article

A Review on Source Code Documentation

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3519312

Keywords

Summarization; software documentation; source code; deep learning; summary generation; name prediction

This article reviews current practices in code documentation and analyzes the existing literature to assess how well the proposed approaches are prepared to automate source code documentation, and which challenges lie ahead. The research community has focused primarily on method-level summarization, and deep learning has dominated the field over the past five years. Researchers regularly propose larger corpora for source code documentation, most often drawn from Java and Python code. Bilingual Evaluation Understudy (BLEU) is the evaluation metric researchers favor most.
Context: Coding is an incremental activity in which a developer may need to understand existing code before making suitable changes to it. Code documentation is considered one of the best practices in software development, but it requires significant effort from developers. Recent advances in natural language processing and machine learning have provided enough motivation to devise automated approaches for source code documentation at multiple levels.
Objective: The review aims to study current code documentation practices and analyze the existing literature to provide a perspective on their preparedness to address the stated problem and the challenges that lie ahead.
Methodology: We provide a detailed account of the literature on automated source code documentation at different levels and critically analyze the effectiveness of the proposed approaches. This also allows us to infer the gaps and challenges in addressing the problem at each level.
Findings: (1) The research community has focused on method-level summarization. (2) Deep learning has dominated the past five years of this research field. (3) Researchers regularly propose bigger corpora for source code documentation. (4) Java and Python are the most widely used programming languages in these corpora. (5) Bilingual Evaluation Understudy (BLEU) is the evaluation metric most favored by researchers.
