Article

An evidential reasoning rule based feature selection for improving trauma outcome prediction

Journal

APPLIED SOFT COMPUTING
Volume 103

Publisher

ELSEVIER
DOI: 10.1016/j.asoc.2021.107112

Keywords

Feature selection; Trauma; Evidential reasoning rule; Random forest; ReliefF; Imbalance classes

Funding

  1. Saudi Arabian Government
  2. EU project, United Kingdom [823759]
  3. NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization, United Kingdom [U1709215]


Key features for accurately predicting patient outcomes can be selected through random forest, ReliefF, and evidential reasoning (ER) rule. The impact of outcome class imbalance on feature selection is discussed, with synthetic minority over-sampling technique (SMOTE) showing differences in selected features. The highest prediction performance is achieved by the ER rule for selecting features, followed by ReliefF and random forest.
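To make one of the filter methods named above concrete, the following is a minimal NumPy sketch of ReliefF-style feature weighting, not the paper's implementation; the function name, neighbour count, and scaling choices are illustrative assumptions. Features whose values differ between classes but agree within a class receive higher weights.

```python
import numpy as np

def relieff(X, y, n_neighbors=5, rng=None):
    """Minimal ReliefF-style feature weighting (illustrative sketch).

    For each sampled instance, weights decrease by the mean feature
    difference to its nearest same-class neighbours ("hits") and
    increase by the mean difference to its nearest other-class
    neighbours ("misses")."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # scale each feature to [0, 1] so differences are comparable
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    weights = np.zeros(d)
    for i in rng.choice(n, size=n, replace=True):
        same = np.where(y == y[i])[0]
        same = same[same != i]
        diff = np.where(y != y[i])[0]
        # nearest hits and misses by L1 distance in the scaled space
        d_same = np.abs(Xs[same] - Xs[i]).sum(axis=1)
        d_diff = np.abs(Xs[diff] - Xs[i]).sum(axis=1)
        hits = same[np.argsort(d_same)[:n_neighbors]]
        misses = diff[np.argsort(d_diff)[:n_neighbors]]
        weights -= np.abs(Xs[hits] - Xs[i]).mean(axis=0) / n
        weights += np.abs(Xs[misses] - Xs[i]).mean(axis=0) / n
    return weights
```

Selecting the ten highest-weighted variables, as the paper does, would then amount to taking the ten largest entries of the returned weight vector.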
Various demographic and medical factors can be linked to severe deterioration of patients suffering from traumatic injuries. Accurate identification of the most relevant variables is essential for building more accurate prediction models and making rapid, life-saving medical decisions. This paper aims to select a set of features that can be used to accurately predict patient outcomes, using three feature selection methods: random forest, ReliefF, and the evidential reasoning (ER) rule. The impact of outcome class imbalance on feature selection is discussed, and the synthetic minority over-sampling technique (SMOTE) is applied to show the differences in the selected features. The results show that length of stay in hospital, length of stay in the intensive care unit, age, and Glasgow Coma Scale (GCS) are the features most frequently selected across the different techniques. The prediction models based on the features selected by the ER rule show the highest prediction performance, as measured by the area under the receiver operating characteristic curve (AUC): the model built on the ten highest-weighted variables achieves a median AUC of 0.895, compared with median AUCs of 0.827 and 0.885 when the ten highest-weighted variables are selected by ReliefF and random forest, respectively. The results also show that, beyond the ten most important features, adding less important features yields only a slight increase in prediction accuracy. (C) 2021 Published by Elsevier B.V.
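The SMOTE step mentioned in the abstract can be sketched as follows; this is a minimal NumPy illustration of SMOTE-style oversampling, not the exact procedure used in the paper, and the function name and neighbour count are assumptions. Each synthetic sample is placed at a random point on the segment between a minority-class instance and one of its k nearest minority-class neighbours.

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE-style oversampling sketch for the minority class.

    Returns n_new synthetic samples, each interpolated between a
    randomly chosen minority point and one of its k nearest
    minority-class neighbours."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    k = min(k, n - 1)
    # pairwise Euclidean distances within the minority class
    dists = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=2)
    np.fill_diagonal(dists, np.inf)           # exclude self-matches
    nn = np.argsort(dists, axis=1)[:, :k]     # k nearest neighbours
    base = rng.integers(0, n, size=n_new)     # seed points
    neigh = nn[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))              # interpolation factor
    return X_min[base] + gap * (X_min[neigh] - X_min[base])
```

Because each synthetic point is a convex combination of two minority samples, the oversampled class stays within the region spanned by the original minority data, which is why SMOTE can shift which features appear discriminative compared with the imbalanced data.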


