4.6 Article

A Study on ML-Based Software Defect Detection for Security Traceability in Smart Healthcare Applications

Journal

SENSORS
Volume 23, Issue 7

Publisher

MDPI
DOI: 10.3390/s23073470

Keywords

machine learning; feature extraction; feature selection; ensemble learning; software defects prediction; software development life-cycle


Software Defect Prediction (SDP) is an integral aspect of the Software Development Life-Cycle (SDLC). This article investigates how feature extraction techniques, feature selection techniques, and a range of Machine Learning (ML) algorithms affect SDP. The results show that Partial Least Squares Regression (PLS) provides the most consistent improvements in predicting software defects, while Principal Component Analysis (PCA) combined with Elastic Net shows acceptable improvement.
Software Defect Prediction (SDP) is an integral aspect of the Software Development Life-Cycle (SDLC). As software systems become more prevalent and more integrated into our daily lives, their growing complexity increases the risk of widespread defects. Despite increasing reliance on these systems, the ability to accurately identify a defective model using Machine Learning (ML) has been largely overlooked. This article therefore contributes an investigation of various ML techniques for SDP. It presents an investigation, comparative analysis, and recommendation of appropriate Feature Extraction (FE) techniques (Principal Component Analysis (PCA) and Partial Least Squares Regression (PLS)) and Feature Selection (FS) techniques (Fisher score, Recursive Feature Elimination (RFE), and Elastic Net). These techniques are validated, both separately and in combination, with the following ML algorithms: Support Vector Machine (SVM), Logistic Regression (LR), Naive Bayes (NB), K-Nearest Neighbour (KNN), Multilayer Perceptron (MLP), and Decision Tree (DT), together with the ensemble learning methods Bootstrap Aggregation (Bagging), Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), Random Forest (RF), and Generalized Stacking (Stacking). An extensive experimental setup was built, and the experiments revealed that FE and FS can affect performance both positively and negatively relative to the base model (Baseline). PLS, both separately and in combination with FS techniques, provides the largest and most consistent improvements, while PCA in combination with Elastic Net shows acceptable improvement.
