Article

Building an efficient curation workflow for the Arabidopsis literature corpus

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/database/bas047

Funding

  1. National Science Foundation [DBI-0850219]
  2. National Institutes of Health National Human Genome Research Institute (NHGRI) [5P41HG002273-09]
  3. TAIR
  4. National Science Foundation, Directorate for Biological Sciences, Division of Biological Infrastructure [0850219]

TAIR (The Arabidopsis Information Resource) is the model organism database (MOD) for Arabidopsis thaliana, a model plant with a literature corpus of about 39 000 articles in PubMed, with over 4300 new articles added in 2011. We have developed a literature curation workflow incorporating both automated and manual elements to cope with this flood of new research articles. The current workflow can be divided into two phases: article selection and curation. Structured controlled vocabularies, such as the Gene Ontology and Plant Ontology, are used to capture free-text information in the literature as succinct ontology-based annotations suitable for computational analysis. We also describe our curation platform and the use of text mining tools in our workflow.
