Journal
JOURNAL OF BIOMEDICAL INFORMATICS
Volume 64, Pages 179-191
Publisher
ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2016.10.005
Keywords
Natural language processing; Information extraction; Reference resolution; Radiology report; Cancer stages; Liver cancer
Funding
- National Institutes of Health, National Center for Advancing Translational Sciences [KL2 TR000421]
- UW Institute of Translational Health Sciences [UL1TR000423]
Background: Anaphoric references occur ubiquitously in clinical narrative text. However, reference resolution, still very much an open challenge, receives comparatively little attention in clinical text applications. Furthermore, existing research on reference resolution is often conducted in isolation from the real-world tasks that motivate it.
Objective: In this paper, we present a machine-learning system that automatically performs reference resolution and a rule-based system that extracts tumor characteristics, with component-based and end-to-end evaluations. Specifically, our goal was to build an algorithm that takes in tumor templates and outputs tumor characteristics, e.g., tumor number and largest tumor size, necessary for identifying patient liver cancer stage phenotypes.
Results: Our reference resolution system reached a modest performance of 0.66 F1 for the averaged MUC, B-cubed, and CEAF scores for coreference resolution and 0.43 F1 for particularization relations. However, even this modest performance substantially improved automatic tumor characteristics annotation over using no reference resolution.
Conclusion: Experiments revealed the benefit of reference resolution even for relatively simple tumor characteristic variables such as largest tumor size. However, we found that different overall variables had different tolerances to upstream reference resolution errors, highlighting the need to characterize systems by end-to-end evaluations. (C) 2016 Elsevier Inc. All rights reserved.
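As an illustration only, the following minimal sketch shows how coreference output could feed a rule-based summary of tumor number and largest tumor size. It is not the system described in the abstract: the `TumorMention` record, its fields, the `cluster_id` labels, and `summarize_tumors` are hypothetical names introduced here, and the rule (one tumor per coreference cluster, largest size taken over all measured mentions) is an assumed simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TumorMention:
    """Hypothetical tumor template for one mention in a radiology report."""
    mention_id: str
    size_cm: Optional[float]    # None when the mention carries no measurement
    cluster_id: Optional[str]   # assumed coreference cluster label; None = unresolved


def summarize_tumors(mentions):
    """Count distinct tumors and find the largest measured size.

    Mentions that share a cluster_id are treated as the same tumor;
    unresolved mentions are each counted as a separate tumor.
    """
    clusters = {}
    for m in mentions:
        key = m.cluster_id if m.cluster_id is not None else f"singleton:{m.mention_id}"
        clusters.setdefault(key, []).append(m)
    sizes = [m.size_cm for m in mentions if m.size_cm is not None]
    return {
        "tumor_count": len(clusters),
        "largest_size_cm": max(sizes) if sizes else None,
    }


# Made-up example: "m2" refers back to the same lesion as "m1".
mentions = [
    TumorMention("m1", 3.2, "t1"),
    TumorMention("m2", None, "t1"),
    TumorMention("m3", 1.5, "t2"),
]
with_resolution = summarize_tumors(mentions)
without_resolution = summarize_tumors(
    [TumorMention(m.mention_id, m.size_cm, None) for m in mentions]
)
```

In this toy input, dropping the cluster labels inflates the tumor count from 2 to 3 while leaving the largest size unchanged, which is one simple way different output variables can differ in their sensitivity to upstream reference resolution.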