Article; Proceedings Paper

Detecting noisy instances with the ensemble filter: A study in software quality estimation

Publisher

WORLD SCIENTIFIC PUBL CO PTE LTD
DOI: 10.1142/S0218194006002677

Keywords

software metrics; software quality; noise detection; ensemble filter

The performance of a classification model is invariably affected by the characteristics of the measurement data it is built upon. If the quality of the data is generally poor, the classification model will perform poorly. Detecting and removing noisy instances improves the quality of the data and, consequently, the performance of the classification model. We investigate a noise-handling technique that attempts to improve the quality of datasets for classification purposes by eliminating instances that are likely to be noise. Our approach uses twenty-five different classification techniques to create an ensemble filter for eliminating likely noise. The basic assumption is that if a given majority of the classifiers in the ensemble misclassify an instance, then it is likely to be a noisy instance. Using a relatively large number of base-level classifiers in the ensemble filter makes it possible to achieve the desired level of conservativeness in noise removal, with several possible levels of filtering. It also provides a higher degree of confidence in the noise elimination procedure, because with twenty-five base-level classifiers the results are less likely to be influenced by the (possibly) inappropriate learning bias of a few algorithms than with a relatively small number of base-level classifiers. Empirical case studies of two high-assurance software projects demonstrate the effectiveness of our noise elimination approach through the significant improvement achieved in classification accuracy at various levels of noise filtering.
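
The filtering rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes three toy learners (a nearest-centroid rule and two k-nearest-neighbour rules) in place of the twenty-five classifiers, and it assumes that each instance is judged by models trained on the other folds of a simple cross-validation split, which the abstract does not specify. The names `ensemble_filter`, `min_misclassified`, and `n_folds` are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def nearest_centroid(train_X, train_y):
    """Toy learner (assumption): predict the class with the closest mean."""
    groups = {}
    for x, label in zip(train_X, train_y):
        groups.setdefault(label, []).append(x)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }
    return lambda x: min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))


def knn(k):
    """Toy learner factory (assumption): majority vote of the k nearest."""
    def learner(train_X, train_y):
        def predict(x):
            order = sorted(range(len(train_X)),
                           key=lambda i: sq_dist(x, train_X[i]))
            votes = Counter(train_y[i] for i in order[:k])
            return votes.most_common(1)[0][0]
        return predict
    return learner


def ensemble_filter(X, y, learners, min_misclassified, n_folds=3):
    """Flag instances misclassified by at least `min_misclassified` learners.

    Each instance is predicted by models trained on the other folds, so a
    learner never sees the label it is asked to judge.
    """
    n = len(X)
    misses = [0] * n
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for learner in learners:
            predict = learner(train_X, train_y)
            for i in test_idx:
                if predict(X[i]) != y[i]:
                    misses[i] += 1
    noisy = [i for i in range(n) if misses[i] >= min_misclassified]
    return noisy, misses


# Toy data: two well-separated clusters plus one mislabeled point (index 8).
X = [(0, 0), (1, 0), (0, 1), (1, 1),
     (10, 10), (11, 10), (10, 11), (11, 11),
     (0.5, 0.5)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
learners = [nearest_centroid, knn(1), knn(3)]

# Majority filter: at least 2 of the 3 learners must disagree with the label.
noisy, misses = ensemble_filter(X, y, learners, min_misclassified=2)
```

On this toy data, the majority setting (`min_misclassified=2`) and the consensus setting (`min_misclassified=3`) both flag only the mislabeled instance at index 8, while lowering the threshold to 1 also removes some clean neighbours of that point. This is the conservativeness trade-off the abstract refers to: a larger ensemble allows more intermediate threshold levels between the lenient and the strict extremes.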
