4.7 Article

HGSORF: Henry Gas Solubility Optimization-based Random Forest for C-Section prediction and XAI-based cause analysis

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 147

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2022.105671

Keywords

Cesarean section; Machine learning; Hyperparameter optimization; ADASYN; HGSORF; XAI; SHAP; LIME

Funding

  1. Deanship of Scientific Research at King Saud University [RG-1441-394]


A stable predictive model is crucial for accurately forecasting cesarean delivery. To improve the accuracy of prediction, a Henry gas solubility optimization-based random forest model has been proposed. The model achieved superior performance and was explained using explainable artificial intelligence tools.
A stable predictive model is essential for forecasting the likelihood of cesarean section (C-section, CS) delivery, because unnecessary CS delivery can adversely affect neonatal, maternal, and pediatric morbidity and mortality and can impose a significant financial burden. Few state-of-the-art machine learning models have been applied in this area in recent years, and the current models do not predict the probability of CS delivery accurately enough. To address this drawback, we propose a Henry gas solubility optimization (HGSO)-based random forest (RF) with an improved objective function, called HGSORF, for classifying CS and non-CS cases. Real-world CS datasets, such as the Pakistan Demographic and Health Survey (PDHS) dataset used in this study, can be noisy. HGSO can provide fine-tuned RF hyperparameters by avoiding local minima. For comparison, Gaussian naive Bayes (GNB), linear discriminant analysis (LDA), K-nearest neighbors (KNN), gradient boosting classifier (GBC), and logistic regression (LR) were also considered. The ADAptive SYNthetic (ADASYN) algorithm was used to balance the classes, and the proposed HGSORF was compared with the other classifiers as well as with other studies. HGSORF achieved the best performance, with an accuracy of 98.33% on the PDHS dataset. The RF hyperparameters were also optimized with commonly used hyperparameter-optimization algorithms, and the proposed HGSORF performed comparatively better. Additionally, to analyze the causes of CS and their significance, HGSORF is explained locally and globally using eXplainable artificial intelligence (XAI)-based tools such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME). A decision support system has been developed as a potential application to support clinical staff.
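The class-balancing step mentioned in the abstract can be illustrated with a minimal, standard-library sketch of the ADASYN idea: minority samples surrounded by more majority neighbours receive more synthetic points. This is not the authors' code. The function names, the toy two-dimensional data, and the choices of `k` and `beta` are assumptions made for illustration, and the interpolation step uses the k nearest minority neighbours, which is a simplification.

```python
import math
import random


def adasyn_allocation(X, y, minority=1, k=3, beta=1.0):
    """Number of synthetic points to create for each minority sample."""
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    total_needed = (n_maj - n_min) * beta  # G in the ADASYN formulation

    ratios = []
    for i in minority_idx:
        # k nearest neighbours of X[i] among all other samples
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:k]
        # share of those neighbours that belong to the majority class
        ratios.append(sum(1 for j in neighbours if y[j] != minority) / k)

    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        # Choice made in this sketch: no minority point is "hard", so add nothing.
        return {i: 0 for i in minority_idx}
    return {
        i: round(r / ratio_sum * total_needed)
        for i, r in zip(minority_idx, ratios)
    }


def adasyn_oversample(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Return new (X, y) lists with interpolated minority samples appended."""
    rng = random.Random(seed)
    counts = adasyn_allocation(X, y, minority, k, beta)
    minority_idx = list(counts)
    synthetic = []
    for i, n_new in counts.items():
        # Simplification: interpolate towards the k nearest minority neighbours.
        partners = sorted(
            (j for j in minority_idx if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:k]
        if not partners:
            continue
        for _ in range(n_new):
            j = rng.choice(partners)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return X + synthetic, y + [minority] * len(synthetic)


# Hypothetical toy data: 10 majority points (label 0) and 4 minority points (label 1).
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 2], [2, 3], [3, 2], [0, 2], [2, 0], [1, 2],
         [0.5, 0.5], [5, 5], [5, 6], [6, 5]]
y_toy = [0] * 10 + [1] * 4

counts_toy = adasyn_allocation(X_toy, y_toy)
X_bal, y_bal = adasyn_oversample(X_toy, y_toy)
```

In this toy set the minority point at (0.5, 0.5) sits inside the majority cluster, so it is allocated more synthetic samples than the three minority points that form their own cluster. In practice a maintained implementation would normally be preferred over a hand-written one.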
All pre-trained models and relevant code are available at: https://github.com/MIrazul29/HGSORF_CSection.
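To make the SHAP-based cause analysis mentioned in the abstract more concrete, the sketch below computes exact Shapley values for a tiny hand-written model. It is an illustration of the underlying attribution formula only, not the SHAP library and not the authors' pipeline. The toy model, the all-zero baseline, and the rule that absent features are replaced by baseline values are assumptions of this sketch.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function value(frozenset) -> float."""
    players = range(n_features)
    phi = []
    for i in players:
        rest = [j for j in players if j != i]
        contribution = 0.0
        for size in range(len(rest) + 1):
            # Weight of a coalition of this size in the Shapley formula
            weight = (
                factorial(size) * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(rest, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi


def toy_model(x):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2 * x[0] + 4 * x[1] * x[2]


x_explained = [1.0, 1.0, 1.0]  # instance to explain (hypothetical)
baseline = [0.0, 0.0, 0.0]     # reference input (assumption of this sketch)


def coalition_value(present):
    """Model output when only the features in `present` take their real values."""
    mixed = [
        x_explained[j] if j in present else baseline[j]
        for j in range(len(x_explained))
    ]
    return toy_model(mixed)


phi = shapley_values(coalition_value, n_features=3)
```

The attributions sum to the difference between the model output for the explained instance and for the baseline, and the interaction term is split equally between the two features that share it. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.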

