4.6 Article

Biomedical named entity recognition with the combined feature attention and fully-shared multi-task learning

Journal

BMC BIOINFORMATICS
Volume 23, Issue 1

Publisher

BMC
DOI: 10.1186/s12859-022-04994-3

Keywords

Named entity recognition; Biomedical text mining; Syntactic information; Multi-task learning; Attention

Funding

  1. Ministry of Science and Technology, ROC [109-2221-E-468-014-MY3]

This study introduces a novel fully-shared multi-task learning model based on a pre-trained language model for the biomedical domain. The model achieved significant performance improvements over single-task models on seven benchmark BioNER datasets.

Background

Biomedical named entity recognition (BioNER) is a fundamental task in biomedical text mining whose purpose is to automatically recognize and classify biomedical entities. The performance of BioNER systems directly affects downstream applications. Recently, deep neural networks, especially pre-trained language models, have made great progress on BioNER. However, because high-quality, large-scale annotated data and relevant external knowledge are scarce, the capability of BioNER systems remains limited.

Results

In this paper, we propose a novel fully-shared multi-task learning model based on BioBERT, a pre-trained language model for the biomedical domain, with a new attention module that integrates automatically processed syntactic information for the BioNER task. We conducted extensive experiments on seven benchmark BioNER datasets. Compared to the single-task BioBERT model, the best proposed multi-task model obtains F1 score improvements of 1.03% on BC2GM, 0.91% on NCBI-disease, 0.81% on Linnaeus, 1.26% on JNLPBA, 0.82% on BC5CDR-Chemical, 0.87% on BC5CDR-Disease, and 1.10% on Species-800.

Conclusion

The results demonstrate that our model outperforms previous studies on all datasets. Further analysis and case studies are also provided to show the importance of the proposed attention module and the fully-shared multi-task learning method used in our model.
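The F1 scores reported above are the usual yardstick for BioNER. As a rough illustration only, the sketch below computes an entity-level, exact-match F1 from BIO-tagged sequences. The abstract does not state the tagging scheme or the evaluation script, so the BIO format, the exact-match criterion, and the helper names `bio_to_spans` and `entity_f1` are all assumptions made for this example, not the authors' code.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); a hypothetical span format.
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the sentence end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((start, i, etype))
        if tag.startswith(("B-", "I-")):
            # A stray "I-" tag is leniently treated as the start of an entity.
            start, etype = i, tag[2:]
        else:
            start, etype = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 where a prediction counts only on exact span and type match."""
    true_pos = n_gold = n_pred = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = bio_to_spans(gold_tags)
        pred_spans = bio_to_spans(pred_tags)
        true_pos += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy sentences, invented for illustration; not taken from any dataset.
    gold = [["B-Disease", "I-Disease", "O", "B-Chemical"]]
    pred = [["B-Disease", "I-Disease", "O", "O"]]
    print(f"entity-level F1: {entity_f1(gold, pred):.4f}")
```

In this toy case the prediction finds one of the two gold entities and makes no spurious predictions, so precision is 1.0 and recall is 0.5. Under this reading, an improvement such as 1.03% on BC2GM would correspond to a difference of about one point in this score expressed as a percentage.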
