Journal
JOURNAL OF BIOMEDICAL INFORMATICS
Volume 58, Pages S120-S127
Publisher
ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2015.06.030
Keywords
Information extraction; Clinical natural language processing; Text mining
Funding
- NLM R00 Grant [LM011389]
This paper describes the use of an agile text mining platform (Linguamatics' Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors from patient records, as defined in the i2b2/UTHealth 2014 challenge. The approach combines a data-driven, rule-based methodology with a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-score of 91.7%, one percentage point behind the top-performing system. © 2015 Elsevier Inc. All rights reserved.