Journal
COMPUTERS IN BIOLOGY AND MEDICINE
Volume 166, Issue -, Pages -
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2023.107535
Keywords
Dense retrieval; Natural language processing; Knowledge graph; Semantic matching
This paper proposes a method to incorporate external knowledge into a dense retrieval model to improve the effectiveness of biomedical retrieval based on pre-trained language models. Experimental results demonstrate that the proposed method outperforms existing methods and has relatively low query latency.
In recent years, pre-trained language models (PLMs) have dominated natural language processing (NLP) and achieved outstanding performance in various NLP tasks, including dense retrieval. In the biomedical domain, however, the effectiveness of PLM-based dense retrieval models still needs improvement, because the richness of biomedical entities makes entity expressions diverse and ambiguous. To alleviate this semantic gap, we propose a method that incorporates entity-level external knowledge into a dense retrieval model to enrich the dense representations of queries and documents. Specifically, we first add self-attention and information interaction modules to the Transformer layer of the BERT architecture to fuse query/document text with entity embeddings from knowledge graphs and let them interact. We then propose an entity similarity loss that constrains the model to better learn external knowledge from entity embeddings, and a weighted entity concatenation mechanism that balances the impact of entity representations when matching queries and documents. Experiments on two publicly available biomedical retrieval datasets show that the proposed method outperforms state-of-the-art dense retrieval methods. In terms of NDCG, the proposed method (called ELK) improves the ranking performance of coCondenser by at least 5% on both datasets, and also obtains a further performance gain over state-of-the-art EVA methods. Despite its more sophisticated architecture, the average query latency of ELK remains within the same order of magnitude as that of other efficient methods.
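The abstract does not spell out how the weighted entity concatenation is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each query and document has a text embedding and an entity embedding, that the entity part is scaled by a single scalar weight before concatenation, and that relevance is the dot product of the concatenated vectors. The names `weighted_concat`, `relevance_score`, and `weight` are hypothetical.

```python
from typing import List, Sequence


def weighted_concat(text_vec: Sequence[float],
                    entity_vec: Sequence[float],
                    weight: float) -> List[float]:
    """Concatenate a text embedding with an entity embedding scaled by `weight`.

    Assumption: a single scalar weight controls the entity contribution.
    """
    return list(text_vec) + [weight * x for x in entity_vec]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product; raises if the vectors differ in length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def relevance_score(q_text: Sequence[float], q_ent: Sequence[float],
                    d_text: Sequence[float], d_ent: Sequence[float],
                    weight: float) -> float:
    """Score a query/document pair on the concatenated representations.

    Under these assumptions the score equals
    dot(q_text, d_text) + weight**2 * dot(q_ent, d_ent).
    """
    q = weighted_concat(q_text, q_ent, weight)
    d = weighted_concat(d_text, d_ent, weight)
    return dot(q, d)


# Toy vectors, chosen only for illustration.
score = relevance_score([1.0, 0.0], [1.0, 1.0],
                        [0.5, 0.5], [1.0, 0.0],
                        weight=0.5)
```

With `weight=0.0` the score reduces to pure text matching, and larger weights let entity overlap count for more. Because the result is still a single dot product, the concatenated document vectors can be precomputed and indexed like any other dense representation.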