Article

Fusion of Multi-RSMOTE With Fuzzy Integral to Classify Bug Reports With an Imbalanced Distribution

Journal

IEEE TRANSACTIONS ON FUZZY SYSTEMS
Volume 27, Issue 12, Pages 2406-2420

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TFUZZ.2019.2899809

Keywords

Computer bugs; Uncertainty; Standards; Prediction algorithms; Software systems; Software reliability; Bug report identification; class imbalance; fuzzy integral; software quality

Funding

  1. National Natural Science Foundation of China [61672122, 61602077, 61772344, 61732011]
  2. Public Welfare Funds for Scientific Research of Liaoning Province of China [20170005]
  3. Natural Science Foundation of Liaoning Province of China [20170540097]
  4. Fundamental Research Funds for the Central Universities [3132016348]

With the help of automated classification, severe bugs can be identified rapidly so that the latent damage to software projects is minimized. However, bug report datasets commonly suffer from class imbalance: some categories contain far fewer samples than others. When faced with class imbalance, most standard classification approaches fail to learn the distribution of the samples properly and tend to predict class labels poorly. Imbalanced learning therefore becomes critical to improving classification algorithms. In this paper, we propose an improved synthetic minority oversampling technique to avoid the degraded performance caused by class imbalance in bug report datasets. Moreover, to reduce the influence of chance in the random sampling process, we propose a repeated sampling technique that trains different but related classifiers. Finally, an ensemble algorithm based on the Choquet fuzzy integral is employed to combine the classifiers' outputs (the wisdom of crowds) and make better decisions. We conduct comprehensive experiments on several bug report datasets from real-world bug repositories. The results demonstrate that the proposed method improves classification performance across the classes of the data. Specifically, compared with various ensemble learning techniques, the Choquet fuzzy integral achieves outstanding results in integrating multiple random oversampling techniques.
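
The fusion step mentioned in the abstract can be illustrated with a minimal sketch of a discrete Choquet integral. This is not the authors' implementation: the abstract does not say how the fuzzy measure is constructed, so the `additive_measure` helper, the classifier names, the scores, and the weights below are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def additive_measure(weights):
    """Hypothetical helper: a fuzzy measure in which each coalition's
    weight is the sum of its members' weights (weights should sum to 1)."""
    names = list(weights)
    measure = {}
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            measure[frozenset(subset)] = sum(weights[n] for n in subset)
    return measure


def choquet_integral(scores, measure):
    """Discrete Choquet integral of classifier scores.

    scores:  dict mapping classifier name -> score in [0, 1]
    measure: dict mapping frozenset of names -> value in [0, 1],
             monotone, with the full set mapped to 1
    """
    # Visit classifiers in ascending order of score.
    ordered = sorted(scores, key=scores.get)
    total = 0.0
    previous = 0.0
    for i, name in enumerate(ordered):
        # Coalition of all classifiers scoring at least scores[name].
        coalition = frozenset(ordered[i:])
        total += (scores[name] - previous) * measure[coalition]
        previous = scores[name]
    return total


# Assumed example: three classifiers, each trained on a different
# oversampled training set, scoring one bug report as "severe".
scores = {"clf_a": 0.9, "clf_b": 0.6, "clf_c": 0.3}
weights = {"clf_a": 0.5, "clf_b": 0.3, "clf_c": 0.2}
fused = choquet_integral(scores, additive_measure(weights))
print(f"fused score: {fused:.3f}")
```

With an additive measure, as here, the Choquet integral reduces to an ordinary weighted average. Its practical appeal lies in non-additive measures, where a coalition of classifiers can be worth more or less than the sum of its members, so agreement among related classifiers can be rewarded or discounted.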
