4.7 Article

Establishment of noninvasive diabetes risk prediction model based on tongue features and machine learning techniques

Journal

Publisher

ELSEVIER IRELAND LTD
DOI: 10.1016/j.ijmedinf.2021.104429

Keywords

Chinese medicine; Tongue diagnosis; Diabetes; High blood glucose; Critical blood glucose; Non-invasive; Machine learning; Risk prediction

Funding

  1. National Key Research and Development Program of China [2017YFC1703300, 2017YFC1703301]
  2. National Natural Science Foundation of China [81873235, 81973750, 81904094, BWS17J028]


The study aimed to establish a predictive model for evaluating individuals with high blood glucose levels using a combination of TCM tongue diagnosis and machine learning techniques. The results showed that the risk prediction model had high accuracy and performance in both the test and validation sets, with Stacking and ResNet50 models recommended for their non-invasive operation, ease of use, high precision, low false positive rate, and low misdiagnosis rate in detecting hyperglycemia and critical blood glucose levels.
Background: Diabetes is a chronic noncommunicable disease with a high incidence rate. Without early diagnosis or standard treatment, diabetes can lead to serious multisystem complications, which can be life-threatening. Timely detection of and intervention in prediabetes is very important for preventing diabetes, because prediabetes is an inevitable stage in the development and progression of the disease.

Objective: Our objective was to establish a predictive model that can be applied to evaluate people whose blood glucose is in a high or critical state.

Methods: We established a diabetes risk prediction model that combines TCM tongue diagnosis with machine learning techniques. A total of 1512 subjects were recruited from the hospital. After data preprocessing, we obtained dataset 1 and dataset 2. Dataset 1 was used to train classical machine learning models, while dataset 2 was used to train deep learning models. To evaluate the performance of the prediction models, we used classification accuracy (CA), precision, recall, F1-score, the precision-recall curve (P-R curve), the area under the precision-recall curve (AUPRC), the receiver operating characteristic curve (ROC curve), and the area under the receiver operating characteristic curve (AUROC), and then selected the best diabetes risk prediction model.

Results: On the test set of dataset 1, the CA of the non-invasive Stacking model was 71 %, the micro-average AUROC was 0.87, the macro-average AUROC was 0.84, and the micro-average AUPRC was 0.77. In the critical blood glucose group, AUROC was 0.84 and AUPRC was 0.67. In the high blood glucose group, AUROC was 0.87 and AUPRC was 0.83. On the validation set of dataset 2, the CA of the ResNet50 model was 69 %, the micro-average AUROC was 0.84, the macro-average AUROC was 0.83, and the micro-average AUPRC was 0.73. In the critical blood glucose group, AUROC was 0.88 and AUPRC was 0.71. In the high blood glucose group, AUROC was 0.80 and AUPRC was 0.76. On the test set of dataset 2, the CA of the ResNet50 model was 65 %, the micro-average AUROC was 0.83, the macro-average AUROC was 0.82, and the micro-average AUPRC was 0.71. In the critical blood glucose group, AUROC was 0.84 and AUPRC was 0.60. In the high blood glucose group, AUROC was 0.87 and AUPRC was 0.71.

Conclusions: Tongue features can significantly improve the prediction accuracy of a diabetes risk prediction model built on classical machine learning models. In addition to performing well, the recommended Stacking and ResNet50 models are non-invasive and easy to use. In detecting hyperglycemia, both models had high precision, a low false positive rate and a low misdiagnosis rate. In detecting blood glucose in the critical state, both models had high sensitivity, a low false negative rate and a low missed diagnosis rate. The study had proved that the
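
The Results report both micro-average and macro-average AUROC. The sketch below illustrates how these two averages differ for a multi-class classifier, using a one-vs-rest scheme and only the Python standard library. It is not the authors' code: the function names, the three-class layout (0 = normal, 1 = critical, 2 = high; the normal class is an assumption) and the toy scores are all hypothetical.

```python
from typing import List, Sequence, Tuple


def binary_auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_micro_auroc(
    y_true: Sequence[int],
    y_score: Sequence[Sequence[float]],
    n_classes: int,
) -> Tuple[List[float], float, float]:
    """Return (per-class AUROC, macro average, micro average), one-vs-rest."""
    per_class: List[float] = []
    pooled_labels: List[int] = []
    pooled_scores: List[float] = []
    for k in range(n_classes):
        # Treat class k as positive and every other class as negative.
        labels_k = [1 if y == k else 0 for y in y_true]
        scores_k = [row[k] for row in y_score]
        per_class.append(binary_auroc(labels_k, scores_k))
        pooled_labels.extend(labels_k)
        pooled_scores.extend(scores_k)
    # Macro: unweighted mean of the per-class values.
    macro = sum(per_class) / n_classes
    # Micro: one AUROC over all (sample, class) decisions pooled together.
    micro = binary_auroc(pooled_labels, pooled_scores)
    return per_class, macro, micro


# Hypothetical toy data: assumed classes 0 = normal, 1 = critical, 2 = high.
toy_true = [0, 0, 1, 1, 2, 2]
toy_score = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
]
per_class, macro, micro = macro_micro_auroc(toy_true, toy_score, 3)
print("per-class:", per_class, "macro:", macro, "micro:", micro)
```

The macro average weights every class equally, while the micro average pools all decisions and is therefore dominated by the more frequent classes. That is one reason the two values can differ on an imbalanced dataset.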
