Article

Semi-Supervised Deep Fuzzy C-Mean Clustering for Software Fault Prediction

IEEE ACCESS (2018)

Journal

IEEE ACCESS

Volume 6, Issue -, Pages 25675-25685

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2018.2835304

Keywords

Semi-supervised learning; fuzzy C-Mean clustering; feature learning; software fault prediction

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications

Funding

National Basic Research Program (973 Program) of China [2013CB329402]
National Natural Science Foundation of China [61573267, 61473215, 61571342, 61572383, 61501353, 61502369, 61271302, 61272282, 61202176]
Fund for Foreign Scholars in University Research and Teaching Programs (111 Project) [B07048]
Major Research Plan of the National Natural Science Foundation of China [91438201, 91438103]

Abstract

Software fault prediction is an important research area in software quality assurance. In this paper, we propose a semi-supervised deep fuzzy C-mean (DFCM) clustering approach for software fault prediction, which combines semi-supervised DFCM clustering with feature compression techniques. The deep architecture is used to form feature-based multi-clusters of the unlabeled and labeled data sets along with their labeled classes. When training the model, we handle the unlabeled and labeled data simultaneously, transferring hidden information from the unlabeled data to the labeled data to support the construction of a precise model. We use DFCM clustering to handle the class imbalance problem; in addition, fuzzy logic is close to human reasoning and easy to understand. We further improve prediction performance by combining feature learning techniques, namely feature extraction and feature selection (using random under-sampling), to generate good features and remove irrelevant and redundant ones, which reduces the noise in the data used for classification. The results show that the combination of deep multi-clusters and feature techniques works better because of its ability to identify and combine the essential information in the data features. The classification model is built on the maximum homogeneity between the features of the labeled and unlabeled data, and it is trained on the denoised data set obtained by the deep combination of multi-clusters and feature techniques. To check the efficacy of our approach, we chose data sets from real-world software projects (NASA and Eclipse), compared our approach with a number of recent and classical baseline methods, and evaluated performance using measures such as probability of detection, F-measure, and area under the curve.
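For readers unfamiliar with the clustering step named in the abstract, the following is a minimal sketch of standard (unsupervised) fuzzy C-means in NumPy. It is not the paper's semi-supervised deep variant, whose details the abstract does not give. The function name `fuzzy_c_means`, its parameters, and the synthetic two-blob data are assumptions made for illustration only.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy C-means (illustrative; not the paper's DFCM).

    Returns (centers, U), where U[i, k] is the degree to which
    sample i belongs to cluster k and each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m  # fuzzifier m > 1 controls how soft the clusters are
        # Each center is the membership-weighted mean of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Synthetic example: two well-separated blobs (assumed data, not NASA/Eclipse).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # harden the soft memberships for inspection
```

Unlike k-means, each sample keeps a graded membership in every cluster, which is the property the abstract refers to when it describes fuzzy logic as close to human reasoning.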
