Article

Document-Level Biomedical Relation Extraction Leveraging Pretrained Self-Attention Structure and Entity Replacement: Algorithm and Pretreatment Method Validation Study

Journal

JMIR MEDICAL INFORMATICS
Volume 8, Issue 5, Pages -

Publisher

JMIR PUBLICATIONS, INC
DOI: 10.2196/17644

Keywords

self-attention; document-level; relation extraction; biomedical entity pretreatment

Funding

  1. Natural Science Foundation of Guangdong Province of China [2015A030308017]
  2. National Natural Science Foundation of China [61976239]
  3. Innovation Foundation of High-end Scientific Research Institutions of Zhongshan City of China [2019AG031]

Abstract

Background: Most current methods for intrasentence relation extraction in the biomedical literature are inadequate for document-level relation extraction, in which a relationship may cross sentence boundaries. Hence, some approaches extract relations by splitting document-level datasets through heuristic rules and learning methods. However, these approaches may introduce additional noise and do not truly solve the problem of intersentence relation extraction; avoiding noise while extracting cross-sentence relations remains challenging.

Objective: This study aimed to avoid the errors introduced by dividing the document-level dataset, to verify that a self-attention structure can extract biomedical relations from a document with long-distance dependencies and complex semantics, and to discuss the relative benefits of different entity pretreatment methods for biomedical relation extraction.

Methods: This paper proposes a new data preprocessing method and applies a pretrained self-attention structure to document-level biomedical relation extraction, with an entity replacement method to capture very long-distance dependencies and complex semantics.

Results: Compared with state-of-the-art approaches, our method greatly improved precision and increased the F1 value. Experiments with biomedical entity pretreatments showed that a model using an entity replacement method can improve performance.

Conclusions: When all target entity pairs in a document-level dataset are considered as a whole, a pretrained self-attention structure is suitable for capturing very long-distance dependencies and learning the textual context and complicated semantics. A replacement method for biomedical entities is conducive to biomedical relation extraction, especially document-level relation extraction.
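The abstract does not detail the paper's exact entity replacement scheme, but a common pretreatment in biomedical relation extraction replaces each annotated entity mention with a typed placeholder token (e.g., "@CHEMICAL$"), so the model attends to relational context rather than entity surface forms. The following is a minimal sketch of that idea; the function name, placeholder format, and example spans are illustrative assumptions, not taken from the paper:

```python
def replace_entities(text, entities):
    """Replace annotated entity mentions with typed placeholders.

    `entities` is a list of (start, end, entity_type) character spans.
    Placeholders take the form "@TYPE$" (an assumed, illustrative format).
    """
    # Apply replacements from the end of the text backwards so that
    # earlier character offsets remain valid after each substitution.
    for start, end, etype in sorted(entities, key=lambda s: s[0], reverse=True):
        text = text[:start] + f"@{etype.upper()}$" + text[end:]
    return text


doc = "Aspirin reduces the risk of myocardial infarction."
spans = [(0, 7, "chemical"), (28, 49, "disease")]
print(replace_entities(doc, spans))
# -> "@CHEMICAL$ reduces the risk of @DISEASE$."
```

The resulting text can then be fed to a pretrained self-attention model; in practice the placeholder tokens would also be registered with the model's tokenizer so they are not split into subwords.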
