Article

Using machine learning methods to predict hepatic encephalopathy in cirrhotic patients with unbalanced data

Journal

Publisher

ELSEVIER IRELAND LTD
DOI: 10.1016/j.cmpb.2021.106420

Keywords

Cost sensitivity; Hepatic encephalopathy; Disease risk prediction; Weighted random forest; Weighted support vector machine

Funding

  1. National Natural Science Foundation of China [81872714, 82173631]
  2. Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment [201805D111006]
  3. Shanxi Province Science and Technology Achievements Transformation Guide [201604D132042]

This study developed a risk prediction model for liver cirrhosis complicated by HE using machine learning algorithms. The WRF model showed better performance in predicting the incidence of HE, assisting clinicians in identifying high-risk patients.
Objective: Hepatic encephalopathy (HE) is among the most common complications of cirrhosis. Data on cirrhosis with HE are typically unbalanced, so traditional statistical methods and machine learning algorithms often fail to identify the minority class. In this paper, we use machine learning algorithms to construct a risk prediction model for liver cirrhosis complicated by HE, with the aim of improving the efficiency of its prediction.

Method: We collected medical data from 1,256 patients with cirrhosis and performed preprocessing to extract 81 features from these irregular data. To predict HE in cirrhotic patients, we compared several classification methods: logistic regression, weighted random forest (WRF), SVM, and weighted SVM (WSVM). We also used an additional 722 patients with cirrhosis for external validation of the model.

Results: The WRF, WSVM, and logistic regression models recognized patients with HE better than traditional machine learning models (sensitivity > 0.70), but their ability to identify patients without HE was slightly lower (specificity approximately 0.85). The comprehensive evaluation indices of these models were higher than those of the other models (G-means > 0.80 and F-measure > 0.40). For the WRF, the G-means (0.82), F-measure (0.46), and AUC (0.82) were superior to those of the logistic regression and WSVM models, which means that it can better predict the incidence of HE in patients.

Conclusion: The WRF model is more suitable for the classification of unbalanced medical data and can be used to construct a risk prediction and evaluation system for liver cirrhosis complicated by HE. The probabilistic predictions of the WRF model can help clinicians identify patients at high risk of HE.

(c) 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
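The evaluation indices named in the abstract (sensitivity, specificity, G-means, F-measure) can be illustrated with a minimal sketch. The function names and the confusion-matrix counts below are hypothetical and are not taken from the paper. The class-weight formula is the common "balanced" heuristic, n / (2 * n_class) for two classes; the abstract does not state which weighting the authors used, so this is an assumption.

```python
import math


def imbalance_metrics(tp, fn, tn, fp):
    """Compute evaluation indices from confusion-matrix counts.

    tp/fn: patients with HE predicted correctly / missed.
    tn/fp: patients without HE predicted correctly / flagged wrongly.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # G-mean: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(sensitivity * specificity)
    # F-measure: harmonic mean of precision and sensitivity (recall).
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


def balanced_class_weights(n_negative, n_positive):
    """Assumed 'balanced' heuristic: weight = n / (2 * class count)."""
    n = n_negative + n_positive
    return {0: n / (2 * n_negative), 1: n / (2 * n_positive)}


# Hypothetical cohort: 1000 patients, 50 of whom develop HE.
# A classifier that always predicts "no HE" looks accurate but finds no cases.
trivial = imbalance_metrics(tp=0, fn=50, tn=950, fp=0)

# Hypothetical cost-sensitive classifier trading specificity for sensitivity.
weighted = imbalance_metrics(tp=40, fn=10, tn=807, fp=143)

weights = balanced_class_weights(n_negative=950, n_positive=50)

for name, result in (("trivial", trivial), ("weighted", weighted)):
    print(name, {k: round(v, 3) for k, v in result.items()})
print("class weights:", weights)
```

In this sketch the always-negative classifier has high accuracy but a G-mean of zero, which is why accuracy alone is a poor index on unbalanced data and why indices such as G-means and F-measure are reported instead.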

