
Biomedical named entity recognition with the combined feature attention and fully-shared multi-task learning

BMC BIOINFORMATICS (2022)

Journal

BMC BIOINFORMATICS

Volume 23, Issue 1, Pages -

Publisher

BMC

DOI: 10.1186/s12859-022-04994-3

Keywords

Named entity recognition; Biomedical text mining; Syntactic information; Multi-task learning; Attention

Funding

Ministry of Science and Technology, ROC [109-2221-E-468-014-MY3]

Automated Summary

This study introduces a novel fully-shared multi-task learning model built on a pre-trained language model for the biomedical domain. Compared to single-task models, it improves F1 scores by 0.81% to 1.26% on seven benchmark BioNER datasets.

Abstract

Background

Biomedical named entity recognition (BioNER) is a basic and important task in biomedical text mining whose purpose is to automatically recognize and classify biomedical entities. The performance of BioNER systems directly impacts downstream applications. Recently, deep neural networks, especially pre-trained language models, have made great progress on BioNER. However, because high-quality, large-scale annotated data and relevant external knowledge are lacking, the capability of BioNER systems remains limited.

Results

In this paper, we propose a novel fully-shared multi-task learning model based on BioBERT, a language model pre-trained in the biomedical domain, with a new attention module that integrates auto-processed syntactic information for the BioNER task. We have conducted numerous experiments on seven benchmark BioNER datasets. Compared to the single-task BioBERT model, the best proposed multi-task model obtains F1 score improvements of 1.03% on BC2GM, 0.91% on NCBI-disease, 0.81% on Linnaeus, 1.26% on JNLPBA, 0.82% on BC5CDR-Chemical, 0.87% on BC5CDR-Disease, and 1.10% on Species-800.

Conclusion

The results demonstrate that our model outperforms previous studies on all datasets. Further analysis and case studies are also provided to show the importance of the proposed attention module and the fully-shared multi-task learning method used in our model.
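The gains reported above are F1 scores. As a rough illustration of what such a number measures, here is a minimal sketch of entity-level F1 over BIO-tagged sequences. It assumes exact-match scoring (an entity counts only if both its boundaries and its type match), which is a common convention for BioNER benchmarks; the abstract does not state which evaluation script the authors used, and the function names `bio_to_spans` and `entity_f1` are hypothetical.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    Assumption: a stray I- tag with no preceding B- tag is ignored.
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged exact-match F1 over a list of sentences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)      # spans with identical boundaries and type
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Illustrative data only, not taken from the paper.
gold = [["B-Disease", "I-Disease", "O", "B-Disease"]]
pred = [["B-Disease", "I-Disease", "O", "O"]]
score = entity_f1(gold, pred)  # precision 1.0, recall 0.5
```

Under this convention a prediction with a wrong boundary earns no partial credit, which is one reason improvements of around one F1 point on these benchmarks are treated as meaningful.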
