Article; Proceedings Paper

Chronic Kidney Disease stratification using office visit records: Handling data imbalance via hierarchical meta-classification

BMC MEDICAL INFORMATICS AND DECISION MAKING (2018)

Journal

BMC MEDICAL INFORMATICS AND DECISION MAKING

Volume 18, Issue -, Pages -

Publisher

BMC

DOI: 10.1186/s12911-018-0675-x

Keywords

Imbalanced data; Meta-classification; Hierarchical classification; Electronic health records; Kidney disease

Category

Medical Informatics

Funding

NIGMS IDeA [U54-GM104941, P20 GM103446]
NSF IIS EAGER [1650851]

Abstract

Background: Chronic Kidney Disease (CKD) is one of several conditions that affect a growing percentage of the US population; the disease is accompanied by multiple co-morbidities, and is hard to diagnose in and of itself. In its advanced forms it carries severe outcomes and can lead to death. It is thus important to detect the disease as early as possible, which can help devise effective intervention and treatment plans. Here we investigate ways to utilize information available in electronic health records (EHRs) from regular office visits of more than 13,000 patients, in order to distinguish among several stages of the disease. While clinical data stored in EHRs provide valuable information for risk stratification, one of the major challenges in using them arises from data imbalance. That is, records associated with a more severe condition are typically under-represented compared to those associated with a milder manifestation of the disease. To address imbalance, we propose and develop a sampling-based ensemble approach, hierarchical meta-classification, aiming to stratify CKD patients into severity stages, using simple quantitative non-text features gathered from standard office visit records.

Methods: The proposed hierarchical meta-classification method frames the multiclass classification task as a hierarchy of two subtasks. The first is binary classification, separating records associated with the majority class from those associated with all minority classes combined, using meta-classification. The second subtask separates the records assigned to the combined minority classes into the individual constituent classes.

Results: The proposed method identifies a significant proportion of patients suffering from the more advanced stages of the condition, while also correctly identifying most of the less severe cases, maintaining high sensitivity, specificity and F-measure (93%). Our results show that the high level of performance attained by our method is preserved even when the size of the training set is significantly reduced, demonstrating the stability and generalizability of our approach.

Conclusion: We present a new approach to perform classification while addressing data imbalance, which is inherent in the biomedical domain. Our model effectively identifies severity stages of CKD patients, using information readily available in office visit records within the realistic context of high data imbalance.
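The two-level scheme described in the Methods can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: the abstract does not specify the base classifiers, the sampling scheme, or how base outputs are combined. The sketch assumes random under-sampling of the majority class, a toy nearest-centroid base learner, and a plain majority vote standing in for the meta-classifier. All names (NearestCentroid, HierarchicalSketch, make_cluster), the labels "A", "B", "C", and the synthetic data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Toy base learner (an assumption): predict the label of the closest class centroid."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, row):
        return min(self.centroids, key=lambda lab: sq_dist(row, self.centroids[lab]))


class HierarchicalSketch:
    """Level 1: majority vs. all minority classes combined (voting ensemble).
    Level 2: split records flagged as minority into the individual classes."""

    def __init__(self, majority_label, n_base=5, seed=0):
        self.majority_label = majority_label
        self.n_base = n_base
        self.seed = seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        major = [r for r, lab in zip(X, y) if lab == self.majority_label]
        minor = [(r, lab) for r, lab in zip(X, y) if lab != self.majority_label]
        minor_rows = [r for r, _ in minor]
        k = min(len(major), len(minor_rows))
        self.base = []
        for _ in range(self.n_base):
            # Each base learner sees every minority record plus an equally
            # sized random sample of majority records (assumed sampling scheme).
            sample = rng.sample(major, k)
            labels = [False] * k + [True] * len(minor_rows)  # True = minority
            self.base.append(NearestCentroid().fit(sample + minor_rows, labels))
        # Level 2 is trained on minority records only.
        self.level2 = NearestCentroid().fit(minor_rows, [lab for _, lab in minor])
        return self

    def predict_one(self, row):
        minority_votes = sum(m.predict_one(row) for m in self.base)
        if 2 * minority_votes <= len(self.base):
            return self.majority_label
        return self.level2.predict_one(row)


def make_cluster(rng, center, n, spread=0.5):
    """Synthetic records scattered around a center (illustrative data only)."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


data_rng = random.Random(42)
X = (make_cluster(data_rng, (0.0, 0.0), 40)     # majority class "A"
     + make_cluster(data_rng, (5.0, 5.0), 6)    # minority class "B"
     + make_cluster(data_rng, (10.0, 0.0), 4))  # minority class "C"
y = ["A"] * 40 + ["B"] * 6 + ["C"] * 4

model = HierarchicalSketch(majority_label="A").fit(X, y)
for point in ([0.0, 0.0], [5.0, 5.0], [10.0, 0.0]):
    print(point, model.predict_one(point))
```

Balancing each base learner's training sample keeps the minority records from being swamped at the first level, while the second level never sees majority records at all. A trained meta-classifier over the base outputs could replace the vote without changing this overall structure.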
