Article

Predicting the onset of type 2 diabetes using wide and deep learning with electronic health records

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE (2019)

Journal

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE

Volume 182, Issue -, Pages -

Publisher

ELSEVIER IRELAND LTD

DOI: 10.1016/j.cmpb.2019.105055

Keywords

Electronic health records; Incidence; Onset; Prediction; Type 2 diabetes mellitus; Wide and deep learning

Categories

Computer Science, Interdisciplinary Applications; Computer Science, Theory & Methods; Engineering, Biomedical; Medical Informatics

Funding

NVIDIA Corporation

Abstract

Objective: Diabetes is responsible for considerable morbidity, healthcare utilisation and mortality in both developed and developing countries. Currently, methods of treating diabetes are inadequate and costly so prevention becomes an important step in reducing the burden of diabetes and its complications. Electronic health records (EHRs) for each individual or a population have become important tools in understanding developing trends of diseases. Using EHRs to predict the onset of diabetes could improve the quality and efficiency of medical care. In this paper, we apply a wide and deep learning model that combines the strength of a generalised linear model with various features and a deep feed-forward neural network to improve the prediction of the onset of type 2 diabetes mellitus (T2DM).

Materials and methods: The proposed method was implemented by training various models into a logistic loss function using a stochastic gradient descent. We applied this model using public hospital record data provided by the Practice Fusion EHRs for the United States population. The dataset consists of de-identified electronic health records for 9948 patients, of which 1904 have been diagnosed with T2DM. Prediction of diabetes in 2012 was based on data obtained from previous years (2009-2011). The imbalance class of the model was handled by Synthetic Minority Oversampling Technique (SMOTE) for each cross-validation training fold to analyse the performance when synthetic examples for the minority class are created. We used SMOTE of 150 and 300 percent, in which 300 percent means that three new synthetic instances are created for each minority class instance. This results in the approximated diabetes:non-diabetes distributions in the training set of 1:2 and 1:1, respectively.

Results: Our final ensemble model not using SMOTE obtained an accuracy of 84.28%, area under the receiver operating characteristic curve (AUC) of 84.13%, sensitivity of 31.17% and specificity of 96.85%. Using SMOTE of 150 and 300 percent did not improve AUC (83.33% and 82.12%, respectively) but increased sensitivity (49.40% and 71.57%, respectively) with a moderate decrease in specificity (90.16% and 76.59%, respectively).

Discussion and conclusions: Our algorithm has further optimised the prediction of diabetes onset using a novel state-of-the-art machine learning algorithm: the wide and deep learning neural network architecture. (C) 2019 Elsevier B.V. All rights reserved.
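The SMOTE percentages in the abstract can be checked with a little arithmetic, and the oversampling step can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: the function names smote_counts and smote_oversample, the neighbour count k=5 and the cyclic choice of base points are all assumptions made for this example. It also uses the whole-cohort counts (9948 patients, 1904 with T2DM) for the arithmetic, whereas the abstract states that SMOTE was applied within each cross-validation training fold; the class ratio in a stratified fold would be roughly the same.

```python
import math
import random


def smote_counts(n_minority, n_majority, percent):
    """Return (minority count after oversampling, majority count).

    A SMOTE level of `percent` adds percent/100 synthetic instances per
    original minority instance, e.g. 300 percent adds three per instance.
    """
    n_synthetic = round(n_minority * percent / 100)
    return n_minority + n_synthetic, n_majority


def smote_oversample(minority, percent, k=5, seed=0):
    """Create synthetic minority points by interpolating towards neighbours.

    Hypothetical simplified variant: base points are visited cyclically,
    and each synthetic point lies on the segment between a base point and
    one of its k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    n_synthetic = round(len(minority) * percent / 100)
    synthetic = []
    for i in range(n_synthetic):
        base_index = i % len(minority)
        base = minority[base_index]
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for j, p in enumerate(minority) if j != base_index),
            key=lambda p: math.dist(base, p),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation position in [0, 1)
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Cohort sizes taken from the abstract.
n_total, n_t2dm = 9948, 1904
n_non_t2dm = n_total - n_t2dm  # 8044

for percent in (150, 300):
    n_min, n_maj = smote_counts(n_t2dm, n_non_t2dm, percent)
    print(percent, n_min, n_maj, round(n_maj / n_min, 2))
# prints:
# 150 4760 8044 1.69
# 300 7616 8044 1.06
```

Under these assumptions, 150 percent gives about 1.69 non-diabetes records per diabetes record and 300 percent gives about 1.06, which matches the approximate 1:2 and 1:1 distributions quoted in the abstract.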
