Building a semantically annotated corpus of clinical texts

JOURNAL OF BIOMEDICAL INFORMATICS (2009)

Journal

JOURNAL OF BIOMEDICAL INFORMATICS

Volume 42, Issue 5, Pages 950-966

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE

DOI: 10.1016/j.jbi.2008.12.013

Keywords

Corpora; Semantic annotation; Clinical text; Natural language processing; Gold standards; Evaluation; Information extraction; Text mining; Temporal annotation; Annotation guidelines

Funding

UK Medical Research Council [RBI 06367]
Medical Research Council [G0300607] Funding Source: researchfish
MRC [G0300607] Funding Source: UKRI

Abstract

In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, whose value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains. (C) 2009 Elsevier Inc. All rights reserved.
