☆ 4.6 Article

Benchmarking of the 2010 BioCreative Challenge III text-mining competition by the BioGRID and MINT interaction databases

BMC BIOINFORMATICS (2011)

期刊

BMC BIOINFORMATICS

卷 12, 期 -, 页码 -

出版社

BMC

DOI: 10.1186/1471-2105-12-S8-S8

关键词

类别

Biochemical Research Methods Biotechnology & Applied Microbiology Mathematical & Computational Biology

资金

National Institutes of Health National Center for Research Resources [R01RR024031]
Biotechnology and Biological Sciences Research Council [BB/F010486/1]
Canadian Institutes of Health Research [FRN 82940]
European Commission [2007-223411]
Royal Society
Scottish Universities Life Sciences Alliance
National Institutes of Health [1R01RR024031]
Italian Association for Cancer Research (AIRC)
Telethon
EU ENFIN [LSHG-CT-2005-518254]
PSIMEX [223411]
Biotechnology and Biological Sciences Research Council [BB/F010486/1] Funding Source: researchfish
BBSRC [BB/F010486/1] Funding Source: UKRI
Grants-in-Aid for Scientific Research [23700159] Funding Source: KAKEN

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

Background: The vast amount of data published in the primary biomedical literature represents a challenge for the automated extraction and codification of individual data elements. Biological databases that rely solely on manual extraction by expert curators are unable to comprehensively annotate the information dispersed across the entire biomedical literature. The development of efficient tools based on natural language processing (NLP) systems is essential for the selection of relevant publications, identification of data attributes and partially automated annotation. One of the tasks of the Biocreative 2010 Challenge III was devoted to the evaluation of NLP systems developed to identify articles for curation and extraction of protein-protein interaction (PPI) data. Results: The Biocreative 2010 competition addressed three tasks: gene normalization, article classification and interaction method identification. The BioGRID and MINT protein interaction databases both participated in the generation of the test publication set for gene normalization, annotated the development and test sets for article classification, and curated the test set for interaction method classification. These test datasets served as a gold standard for the evaluation of data extraction algorithms. Conclusion: The development of efficient tools for extraction of PPI data is a necessary step to achieve full curation of the biomedical literature. NLP systems can in the first instance facilitate expert curation by refining the list of candidate publications that contain PPI data; more ambitiously, NLP approaches may be able to directly extract relevant information from full-text articles for rapid inspection by expert curators. Close collaboration between biological databases and NLP systems developers will continue to facilitate the long-term objectives of both disciplines.

Benchmarking of the 2010 BioCreative Challenge III text-mining competition by the BioGRID and MINT interaction databases

期刊

BMC BIOINFORMATICS

出版社

BMC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Benchmarking of the 2010 BioCreative Challenge III text-mining competition by the BioGRID and MINT interaction databases

期刊

BMC BIOINFORMATICS

出版社

BMC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文