Improving links between literature and biological data with text mining: a case study with GEO, PDB and MEDLINE

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION (2012)

Journal

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/database/bas026

Category

Mathematical & Computational Biology

Funding

Intramural Research Program of the National Institutes of Health
National Library of Medicine [LM000002-01]

Abstract

High-throughput experiments and bioinformatics techniques are producing a rapidly growing volume of data, which biologists and other researchers who need to access, analyse and process existing data struggle to keep track of. Much of the available data is deposited in specialized databases, such as the Gene Expression Omnibus (GEO) for microarrays or the Protein Data Bank (PDB) for protein structures and coordinates. Authors also describe their data sets in publications archived in literature databases such as MEDLINE and PubMed Central. Currently, the curation of links between biological databases and the literature relies mainly on manual labour, which makes it a time-consuming and daunting task. Herein, we analysed the current state of link curation between GEO, PDB and MEDLINE. We found that link curation is heterogeneous across the sources and databases involved, and that the overlap between sources is low: <50% for PDB and GEO. Furthermore, we showed that text-mining tools can automatically provide valuable evidence to help curators broaden the scope of articles and database entries that they review. As a result, we made recommendations to improve the coverage of curated links and the consistency of information available from different databases, while maintaining high-quality curation.

Database URLs: http://www.ncbi.nlm.nih.gov/PubMed, http://www.ncbi.nlm.nih.gov/geo/, http://www.rcsb.org/pdb/
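As a rough illustration of the kind of evidence text mining can surface for curators, the sketch below scans article text for GEO and PDB accession numbers and measures how much two sets of article-to-entry links overlap. It is not the method used in the paper: the regular expressions, the function names `find_accessions` and `overlap_fraction`, and the choice of intersection over union as the overlap measure are all assumptions made for this example.

```python
import re

# Assumed pattern: GEO accessions are a GSE/GSM/GDS/GPL prefix plus digits.
GEO_PATTERN = re.compile(r"\b(?:GSE|GSM|GDS|GPL)\d+\b")

# Assumed pattern: a PDB ID is one digit plus three alphanumerics. Such a
# short token is ambiguous, so this sketch only accepts it right after a
# cue such as "PDB", "PDB ID" or "PDB code".
PDB_PATTERN = re.compile(
    r"\bPDB(?:\s+(?:ID|code|entry))?[:\s]+([0-9][A-Za-z0-9]{3})\b",
    re.IGNORECASE,
)


def find_accessions(text):
    """Return the candidate GEO and PDB accessions mentioned in text."""
    geo = set(GEO_PATTERN.findall(text))
    pdb = {match.upper() for match in PDB_PATTERN.findall(text)}
    return {"GEO": geo, "PDB": pdb}


def overlap_fraction(links_a, links_b):
    """Share of all links (union) present in both sources.

    Each link is a hashable pair such as (pmid, accession). Intersection
    over union is an assumption of this sketch, not the paper's definition.
    """
    union = links_a | links_b
    if not union:
        return 0.0
    return len(links_a & links_b) / len(union)


if __name__ == "__main__":
    sentence = "Data were deposited in GEO (GSE12345); see also PDB ID: 1abc."
    print(find_accessions(sentence))
```

Candidates found this way would still need curator review, since a matching string is only evidence that an article mentions an entry, not proof that the article describes it.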
