Article

Data mining a diabetic data warehouse

Journal

ARTIFICIAL INTELLIGENCE IN MEDICINE
Volume 26, Issue 1-2, Pages 37-54

Publisher

ELSEVIER
DOI: 10.1016/S0933-3657(02)00051-9

Keywords

data mining; diabetes; data mining software CART


Diabetes is a major health problem in the United States. There is a long history of diabetic registries and databases with systematically collected patient information. We examine one such diabetic data warehouse, showing a method of applying data mining techniques, and some of the data issues, analysis problems, and results. The diabetic data warehouse is from a large integrated health care system in the New Orleans area with 30,383 diabetic patients. Translating a complex relational database with time series and sequencing information into a flat file suitable for data mining is challenging. We discuss two variables in detail, a comorbidity index and the HgbA1c, a measure of glycemic control related to outcomes. We used the classification tree approach in Classification and Regression Trees (CART®) with a binary target variable of HgbA1c >9.5 and 10 predictors: age, sex, emergency department visits, office visits, comorbidity index, dyslipidemia, hypertension, cardiovascular disease, retinopathy, and end-stage renal disease. Unexpectedly, the most important variable associated with bad glycemic control is younger age, not the comorbidity index or whether patients have related diseases. If we want to target diabetics with bad HgbA1c values, the odds of finding them are 3.2 times as high in those <65 years of age as in those older. Data mining can discover novel associations that are useful to clinicians and administrators. © 2002 Elsevier Science B.V. All rights reserved.
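
The two quantitative ideas in the abstract, a single classification-tree split on a binary HgbA1c target and an odds ratio comparing younger with older patients, can be illustrated with a minimal sketch. This is not the authors' CART analysis: the records, the function names (`gini`, `best_age_split`, `odds_ratio`), and the use of Gini impurity as the splitting criterion are assumptions made for illustration, and the invented data will not reproduce the reported odds ratio of 3.2.

```python
from typing import List, Tuple

# Hypothetical flattened records: (age in years, HgbA1c value).
# These values are invented for illustration and are NOT from the paper.
RECORDS: List[Tuple[int, float]] = [
    (34, 10.2), (41, 9.8), (47, 11.0), (52, 8.1), (58, 9.9), (63, 10.4),
    (66, 7.0), (69, 7.4), (72, 9.7), (75, 7.9), (78, 7.2), (81, 6.5),
]

HBA1C_CUTOFF = 9.5  # binary target from the abstract: HgbA1c > 9.5


def gini(labels: List[bool]) -> float:
    """Gini impurity of a list of binary labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_age_split(records: List[Tuple[int, float]]) -> Tuple[float, float]:
    """Find the age threshold that minimises weighted Gini impurity.

    This mimics one split of a classification tree on a single predictor.
    Returns (threshold, weighted impurity after the split).
    """
    labelled = [(age, a1c > HBA1C_CUTOFF) for age, a1c in records]
    ages = sorted({age for age, _ in labelled})
    best_threshold, best_score = float("nan"), float("inf")
    # Candidate thresholds are midpoints between consecutive distinct ages.
    for lo, hi in zip(ages, ages[1:]):
        threshold = (lo + hi) / 2
        left = [bad for age, bad in labelled if age < threshold]
        right = [bad for age, bad in labelled if age >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labelled)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


def odds_ratio(records: List[Tuple[int, float]], age_threshold: float) -> float:
    """Odds of HgbA1c > cutoff below the age threshold, relative to at or above it."""
    young_bad = sum(1 for age, a1c in records if age < age_threshold and a1c > HBA1C_CUTOFF)
    young_ok = sum(1 for age, a1c in records if age < age_threshold and a1c <= HBA1C_CUTOFF)
    old_bad = sum(1 for age, a1c in records if age >= age_threshold and a1c > HBA1C_CUTOFF)
    old_ok = sum(1 for age, a1c in records if age >= age_threshold and a1c <= HBA1C_CUTOFF)
    if young_ok == 0 or old_bad == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is empty")
    # Cross-product form of the 2x2 odds ratio.
    return (young_bad * old_ok) / (young_ok * old_bad)


if __name__ == "__main__":
    threshold, impurity = best_age_split(RECORDS)
    print("best age threshold:", threshold)
    print("weighted Gini after split:", round(impurity, 4))
    print("odds ratio (younger vs older):", odds_ratio(RECORDS, threshold))
```

A full tree would repeat this search over all 10 predictors and recurse into each branch; the sketch stops at one split on age to keep the link between the split and the odds ratio visible.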


