Article

Coronary artery disease risk assessment from unstructured electronic health records using text mining

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 58, Pages S203-S210

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2015.08.003

Keywords

Coronary artery disease; Text mining; Framingham risk score; Temporal data; EHR

Funding

  1. National Institutes of Health (NIH) [2U54LM008748, 1R13LM01141101]
  2. School of Public Health & Community Medicine
  3. Ingham Institute for Applied Medical Research
  4. UNSW Medicine
  5. SouthWest Sydney Local Health District
  6. Cancer Institute of New South Wales
  7. Prince of Wales Clinical School, UNSW Medicine

Coronary artery disease (CAD) often leads to myocardial infarction, which may be fatal. Risk factors can be used to predict CAD, which may in turn allow prevention or early intervention. Determining the risk factors for a disease requires patient data such as co-morbidities, medication history, social history and family history. However, when these data are not collected specifically for risk assessment, they are usually embedded in unstructured clinical narratives. Clinical text mining can be used to extract risk factor data from unstructured clinical notes. This study presents methods to extract Framingham risk factors from unstructured electronic health records using clinical text mining and to calculate 10-year coronary artery disease risk scores in a cohort of diabetic patients. We developed a rule-based system to extract the following risk factors: age, gender, total cholesterol, HDL-C, blood pressure, diabetes history and smoking history. The results showed that the output from the text mining system was reliable, but a significant amount of the data needed to calculate the Framingham risk score was missing. A systematic approach to understanding the missing data was followed by the implementation of imputation strategies. An analysis of the 10-year Framingham risk scores for coronary artery disease in this cohort showed that the majority of the diabetic patients are at moderate risk of CAD. (C) 2015 Elsevier Inc. All rights reserved.
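
The abstract does not give the rules used by the authors' system, so the following is only a minimal sketch of what rule-based extraction of the listed risk factors might look like, assuming simple regular expressions over a single free-text note. Every pattern, field name and function name here (`PATTERNS`, `extract_risk_factors`, `missing_fields`) is hypothetical and chosen for illustration. The sketch handles negation and temporality only crudely (for example, "no history of diabetes" would be misread), and it stops at reporting which fields are missing rather than computing a Framingham score or imputing values.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the rules
# used in the study, which the abstract does not describe.
PATTERNS = {
    "age": re.compile(r"\b(\d{1,3})[- ]?(?:year[- ]old|y/?o)\b", re.I),
    "gender": re.compile(r"\b(male|female|man|woman)\b", re.I),
    "total_cholesterol": re.compile(
        r"\btotal cholesterol\D{0,15}(\d+(?:\.\d+)?)", re.I),
    "hdl_c": re.compile(r"\bHDL(?:-C)?\D{0,15}(\d+(?:\.\d+)?)", re.I),
    "blood_pressure": re.compile(
        r"\b(?:BP|blood pressure)\D{0,15}(\d{2,3})\s*/\s*(\d{2,3})", re.I),
    "diabetes": re.compile(r"\b(?:diabetes|diabetic|T2DM|DM)\b", re.I),
    "non_smoker": re.compile(
        r"\b(?:non-?smoker|never smoked|denies smoking)\b", re.I),
    "smoker": re.compile(r"\b(?:smoker|smokes|smoking)\b", re.I),
}

FIELDS = ["age", "gender", "total_cholesterol", "hdl_c",
          "systolic_bp", "diastolic_bp", "diabetes", "smoker"]


def extract_risk_factors(note: str) -> dict:
    """Return one value per risk factor, or None when the note is silent."""
    found = {name: None for name in FIELDS}

    m = PATTERNS["age"].search(note)
    if m:
        found["age"] = int(m.group(1))

    m = PATTERNS["gender"].search(note)
    if m:
        word = m.group(1).lower()
        found["gender"] = "male" if word in ("male", "man") else "female"

    for field in ("total_cholesterol", "hdl_c"):
        m = PATTERNS[field].search(note)
        if m:
            found[field] = float(m.group(1))

    m = PATTERNS["blood_pressure"].search(note)
    if m:
        found["systolic_bp"] = int(m.group(1))
        found["diastolic_bp"] = int(m.group(2))

    # No negation handling: any mention of diabetes counts as a positive.
    if PATTERNS["diabetes"].search(note):
        found["diabetes"] = True

    # Check the negated smoking phrases first so they are not read as positive.
    if PATTERNS["non_smoker"].search(note):
        found["smoker"] = False
    elif PATTERNS["smoker"].search(note):
        found["smoker"] = True

    return found


def missing_fields(record: dict) -> list:
    """List the risk factors that could not be extracted from the note."""
    return [name for name, value in record.items() if value is None]


if __name__ == "__main__":
    note = ("62-year-old male with type 2 diabetes. BP 138/84. "
            "Total cholesterol 210 mg/dL. Denies smoking.")
    record = extract_risk_factors(note)
    print(record)
    print(missing_fields(record))  # ['hdl_c']
```

In this example note the HDL-C value is absent, so the record cannot be scored as it stands. That is the kind of gap the abstract describes as requiring a systematic look at missing data before any imputation strategy is applied.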
