Article

A graph-based clustering algorithm for software systems modularization

Journal

INFORMATION AND SOFTWARE TECHNOLOGY
Volume 133

Publisher

ELSEVIER
DOI: 10.1016/j.infsof.2020.106469

Keywords

Graph clustering; Clustering algorithms; Software modularization; Software maintenance; Software comprehension; Software architecture; Architecture recovery

This paper introduces a graph-based clustering algorithm named GMA as a new modularization technique that supports the comprehension of software system structures and software refactoring. Experimental results demonstrate that the algorithm produces a modularization closer to a human expert's decomposition.
Context: Clustering algorithms are used as a modularization technique to support the comprehension and refactoring of large software systems. These algorithms partition the source code of a software system into smaller, easy-to-manage modules (clusters). The resulting decomposition is called the software system structure (or software architecture). Because the modularization problem is NP-hard, evolutionary clustering approaches such as genetic algorithms have been used to solve it. These methods make little use of the information and knowledge available in the artifact dependency graph, which is extracted from the source code.

Objective: To overcome the limitations of existing modularization techniques, this paper presents a new modularization technique named GMA (Graph-based Modularization Algorithm).

Methods: The paper presents a new graph-based clustering algorithm for software modularization. The depth of relationships is used to compute the similarity between artifacts, and seven new criteria are proposed to evaluate the quality of a modularization. This similarity measure enables the algorithm to use graph-theoretic information.

Results: To demonstrate the applicability of the proposed algorithm, ten folders of Mozilla Firefox with different domains and functions, along with four other applications, were selected. The experimental results demonstrate that the proposed algorithm produces modularizations closer to the human expert's decomposition (i.e., the directory structure) than other existing algorithms.

Conclusion: The proposed algorithm is expected to help software designers in the reverse engineering process to extract easy-to-manage and understandable modules from source code.
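
The abstract states that similarity between artifacts is computed from the depth of relationships in the artifact dependency graph, but it does not give the formula. As a rough illustration only, the sketch below assumes a hypothetical similarity of 1/d, where d is the shortest-path distance between two artifacts in an undirected view of the dependency graph, cut off at an assumed `max_depth`. The function names, the file names, and the 1/d weighting are all assumptions made for this example and should not be read as GMA's actual definition.

```python
from collections import deque


def undirected(edges):
    """Build an undirected adjacency map from (source, target) dependency edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set()).add(src)
    return graph


def bfs_depths(graph, source, max_depth):
    """Return {artifact: depth} for artifacts within max_depth hops of source."""
    depths = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand beyond the depth cutoff
        for neighbor in graph.get(node, ()):
            if neighbor not in depths:
                depths[neighbor] = depths[node] + 1
                queue.append(neighbor)
    return depths


def depth_similarity(graph, a, b, max_depth=3):
    """Hypothetical similarity (an assumption, not the paper's formula):
    1.0 for identical artifacts, 1/d for artifacts d hops apart,
    0.0 if b is not reachable from a within max_depth hops."""
    if a == b:
        return 1.0
    d = bfs_depths(graph, a, max_depth).get(b)
    return 0.0 if d is None else 1.0 / d


# Hypothetical dependency edges between source files.
edges = [("a.c", "b.c"), ("b.c", "c.c"), ("d.c", "e.c")]
graph = undirected(edges)

direct = depth_similarity(graph, "a.c", "b.c")     # one hop apart
indirect = depth_similarity(graph, "a.c", "c.c")   # two hops apart
unrelated = depth_similarity(graph, "a.c", "d.c")  # different component
```

Under these assumptions, directly dependent artifacts score higher than artifacts related only through an intermediary, and artifacts with no path between them score zero. A clustering step could then group artifacts by such pairwise scores.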
