4.6 Article

Exploring the boundaries: gene and protein identification in biomedical text

期刊

BMC BIOINFORMATICS
卷 6, 期 -, 页码 -

出版社

BMC
DOI: 10.1186/1471-2105-6-S1-S5

关键词

-

向作者/读者索取更多资源

Background: Good automatic information extraction tools offer hope for automatic processing of the exploding biomedical literature, and successful named entity recognition is a key component for such tools. Methods: We present a maximum-entropy based system incorporating a diverse set of features for identifying gene and protein names in biomedical abstracts. Results: This system was entered in the BioCreative comparative evaluation and achieved a precision of 0.83 and recall of 0.84 in the open evaluation and a precision of 0.78 and recall of 0.85 in the closed evaluation. Conclusion: Central contributions are rich use of features derived from the training data at multiple levels of granularity, a focus on correctly identifying entity boundaries, and the innovative use of several external knowledge sources including full MEDLINE abstracts and web searches.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据