Article

Agile text mining for the 2014 i2b2/UTHealth Cardiac risk factors challenge

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 58, Pages S120-S127

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2015.06.030

Keywords

Information extraction; Clinical natural language processing; Text mining

Funding

  1. NLM R00 Grant [LM011389]

This paper describes the use of an agile text mining platform (Linguamatics' Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors in patient records as defined in the i2b2/UTHealth 2014 challenge. The approach uses a data-driven rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-Score of 91.7%, one percent behind the top performing system. (C) 2015 Elsevier Inc. All rights reserved.
