Article

Infant birth weight estimation and low birth weight classification in United Arab Emirates using machine learning algorithms

Journal

SCIENTIFIC REPORTS
Volume 12, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41598-022-14393-6

Funding

  1. Zayed Center for Health Sciences, United Arab Emirates University [31R239-12R080]

Accurate prediction of newborn birth weight is crucial for evaluating their health and safety. This study provides a detailed approach for weight estimation and low birth weight classification. Multiple subsets of features and feature selection techniques were used, along with synthetic minority oversampling, to improve classification performance. The results demonstrate that the Random Forest algorithm performs best for weight estimation, while Logistic Regression with SMOTE oversampling achieves the best performance in low birth weight classification.
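
The summary does not describe how the oversampling was implemented. As a rough illustration of the general idea behind synthetic minority oversampling (new minority-class rows are interpolated between an existing minority row and one of its nearest minority neighbours), a minimal standard-library sketch might look like the following. The function name `smote_sketch`, its parameters, and the toy rows are hypothetical and are not taken from the paper.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a randomly
    chosen minority row and one of its k nearest minority neighbours.

    Illustrative only: not the implementation used in the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy minority class: [gestational age (weeks), maternal age (years)]
lbw_rows = [[34.0, 29.0], [35.5, 31.0], [33.0, 24.0], [36.0, 38.0]]
new_rows = smote_sketch(lbw_rows, n_new=6, k=3)
```

Because every synthetic row is a convex combination of two real minority rows, it always lies within the range already covered by the minority class. Oversampling of this kind is normally applied to the training folds only, so that evaluation data contain real observations exclusively.
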
Accurate prediction of a newborn's birth weight (BW) is crucial for evaluating the newborn's health and safety. Infants with low BW (LBW) are at a higher risk of serious short- and long-term health outcomes. Over the past decade, machine learning (ML) techniques have been applied successfully in medical diagnostics. Various automated systems have been proposed that use maternal features for LBW prediction. However, each proposed system uses a different set of maternal features for LBW classification and BW estimation. Therefore, this paper provides a detailed setup for BW estimation and LBW classification. Multiple subsets of features were combined to perform predictions with and without feature selection techniques. Furthermore, the synthetic minority oversampling technique (SMOTE) was employed to oversample the minority class. The performance of 30 ML algorithms was evaluated for both infant BW estimation and LBW classification. Experiments were performed on a self-created dataset with 88 features, obtained from 821 women at three hospitals in the United Arab Emirates. Mean absolute error and mean absolute percentage error were used to evaluate BW estimation; accuracy, precision, recall, F-scores, and confusion matrices were used to evaluate LBW classification. Extensive experiments using five-fold cross-validation show that the best weight estimation was obtained with the Random Forest algorithm, with a mean absolute error of 294.53 g, while the best classification performance was obtained with Logistic Regression combined with SMOTE oversampling, which achieved accuracy, precision, recall, and F1 score of 90.24%, 87.6%, 90.2%, and 0.89, respectively. The results also suggest that features such as diabetes, hypertension, and gestational age play a vital role in LBW classification.
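
To make the evaluation metrics named in the abstract concrete, the sketch below computes mean absolute error and mean absolute percentage error for weight estimates, and accuracy, precision, recall, and F1 for the LBW class. It rests on several assumptions: the 2500 g cut-off is the commonly used definition of LBW and is not stated in the abstract, the function names and toy weights are hypothetical, and the metrics are computed for the positive (LBW) class only, whereas the abstract does not say which averaging scheme the authors used.

```python
LBW_THRESHOLD_G = 2500  # assumed cut-off; not stated in the abstract


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, in the same unit as the inputs (grams)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute error relative to the true value, in percent."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, plus precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical true and predicted birth weights in grams.
true_g = [3200, 2400, 2900, 2100]
pred_g = [3000, 2600, 2950, 2300]

mae = mean_absolute_error(true_g, pred_g)
mape = mean_absolute_percentage_error(true_g, pred_g)

# Derive LBW labels (1 = LBW) by thresholding the weights.
true_lbw = [int(w < LBW_THRESHOLD_G) for w in true_g]
pred_lbw = [int(w < LBW_THRESHOLD_G) for w in pred_g]
report = binary_metrics(true_lbw, pred_lbw)
```

In this toy example the second infant is truly LBW but is predicted at 2600 g, so it counts as a false negative: recall for the LBW class drops to 0.5 while precision stays at 1.0. This is the kind of minority-class error that accuracy alone can hide when LBW cases are rare.
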
