☆ 4.7 Article

Setting up of a machine learning algorithm for the identification of severe liver fibrosis profile in the general US population cohort

INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS (2023)

Journal

INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS

Volume 170, Issue -, Pages -

Publisher

ELSEVIER IRELAND LTD

DOI: 10.1016/j.ijmedinf.2022.104932

Keywords

Machine learning; Liver fibrosis; Imbalanced dataset; Oversampling techniques; NHANES

Categories

Computer Science, Information Systems; Health Care Sciences & Services; Medical Informatics

Summary

This study developed a machine learning algorithm to identify liver fibrosis in the general US population. Using data preprocessing and feature selection, the authors obtained a model capable of identifying liver fibrosis and confirmed its feasibility on a held-out subset of participants.

Abstract

Background: The progress of digital transformation in clinical practice opens the door to shifting the current clinical pathway for liver disease diagnosis from a late-stage approach to an early-stage one. Early diagnosis of liver fibrosis can prevent the progression of the disease and decrease liver-related morbidity and mortality. We developed a machine learning (ML) algorithm, based on standard parameters, that can identify liver fibrosis in the general US population.

Materials and methods: Starting from a public database (National Health and Nutrition Examination Survey, NHANES) representative of the American population, with 7265 eligible subjects (control population n = 6828, with Fibroscan values E < 9.7 kPa; target population n = 437, with Fibroscan values E >= 9.7 kPa), we set up an SVM algorithm able to discriminate individuals with liver fibrosis within the general US population. The algorithm setup involved the removal of missing data and a sampling optimization step to manage the data imbalance (only ~5 % of the dataset is the target population).

Results: For the feature selection, we performed an unbiased analysis, starting from 33 clinical, anthropometric, and biochemical parameters regardless of their previous application as biomarkers of liver diseases. Through PCA, we identified the 26 most significant features and then used them to set up a sampling method on an SVM algorithm. The best sampling technique to manage the data imbalance was found to be oversampling with SMOTE-NC. For final model validation, we used a subset of 300 individuals (150 with liver fibrosis and 150 controls), set aside from the main dataset prior to sampling. Performance was evaluated over multiple independent runs.

Conclusions: We provide proof of concept of an ML clinical decision support tool for liver fibrosis diagnosis in the general US population. Although the presented ML model is at this stage only a prototype, in the future it might be implemented and potentially applied to plan broad screenings for liver fibrosis.
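The order of operations described in the abstract (set aside a balanced validation subset first, then oversample only the remaining training data) can be illustrated with a small standard-library Python sketch. This is not the authors' code. The feature values are synthetic toy data, the function names `hold_out_balanced` and `oversample_minority` are hypothetical, and the oversampler is a simplified SMOTE-NC-style procedure: it interpolates continuous columns and takes the neighbours' most common value for categorical columns, but omits the categorical-mismatch penalty that SMOTE-NC adds to its distance. Only the class counts (6828 controls, 437 targets, 150 + 150 held out) come from the abstract; the SVM and PCA steps are not shown.

```python
import random
from collections import Counter


def hold_out_balanced(labels, n_per_class, rng):
    """Hold out n_per_class indices of each class; return (held_out, remaining)."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    held = set()
    for idx in by_class.values():
        held.update(rng.sample(idx, n_per_class))
    remaining = [i for i in range(len(labels)) if i not in held]
    return sorted(held), remaining


def oversample_minority(rows, cat_cols, n_new, k, rng):
    """Create n_new synthetic minority rows (simplified SMOTE-NC-style sketch)."""
    num_cols = [j for j in range(len(rows[0])) if j not in cat_cols]

    def dist(a, b):
        # Squared Euclidean distance over the continuous columns only.
        return sum((a[j] - b[j]) ** 2 for j in num_cols)

    # Precompute the k nearest minority neighbours of every minority row.
    neighbours = []
    for i, base in enumerate(rows):
        others = [r for m, r in enumerate(rows) if m != i]
        others.sort(key=lambda r: dist(base, r))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(rows))
        base, near = rows[i], neighbours[i]
        mate = rng.choice(near)
        gap = rng.random()
        new = list(base)
        for j in num_cols:
            # Continuous feature: random point on the segment base -> mate.
            new[j] = base[j] + gap * (mate[j] - base[j])
        for j in cat_cols:
            # Categorical feature: most common value among the neighbours.
            new[j] = Counter(r[j] for r in near).most_common(1)[0][0]
        synthetic.append(new)
    return synthetic


rng = random.Random(0)
labels = [0] * 6828 + [1] * 437  # class counts from the abstract
# Toy features (assumption): two continuous columns, one binary categorical column.
X = [[rng.gauss(5 + 4 * y, 2), rng.gauss(28 + 3 * y, 5), rng.randint(0, 1)]
     for y in labels]

# 1) Set aside 150 + 150 subjects for validation BEFORE any resampling.
val_idx, train_idx = hold_out_balanced(labels, 150, rng)

# 2) Oversample the minority class in the training portion only.
minority = [X[i] for i in train_idx if labels[i] == 1]
n_majority = sum(1 for i in train_idx if labels[i] == 0)
synthetic = oversample_minority(minority, cat_cols=[2],
                                n_new=n_majority - len(minority), k=5, rng=rng)

X_train = [X[i] for i in train_idx] + synthetic
y_train = [labels[i] for i in train_idx] + [1] * len(synthetic)
print(Counter(y_train))
```

Holding out the validation subjects before oversampling keeps synthetic points derived from them out of the training set, so the validation subset contains only real, unseen subjects.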
