Causal relationship extraction from biomedical text using deep neural models: A comprehensive survey

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 119

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2021.103820

Keywords

Biomedical cause-effect; Relation extraction; Natural language processing; Information extraction

Funding

  1. KU Leuven IWT-SBO [150056]

Identifying causal relationships between events or entities in biomedical text is crucial for building scientific knowledge bases and is a fundamental NLP task. Although it remains an open problem in artificial intelligence, the task is attracting growing research attention, and deep neural networks show promise in addressing it. State-of-the-art systems improve considerably when random oversampling is used to reduce class imbalance.
Identifying causal relationships between events or entities in biomedical texts is of great importance for creating scientific knowledge bases and is also a fundamental natural language processing (NLP) task. A causal (cause-effect) relation is defined as an association between two events in which the first must occur before the second. Although this task is an open problem in artificial intelligence and plays an important role in information extraction from the biomedical literature, very few works have considered it. However, with the advent of new machine learning techniques, especially deep neural networks, research increasingly addresses this problem. This paper summarizes state-of-the-art research, its applications, existing datasets, and remaining challenges. For this survey we implemented and evaluated various techniques, including a Multiview CNN (MVC), attention-based BiLSTM models, and state-of-the-art contextual word embedding models such as ELMo and the transformer-based BioBERT. In addition, we evaluated a graph LSTM as well as a rule-based baseline system. We also investigated class imbalance, an innate property of annotated data in this type of task. The results show that state-of-the-art systems improve considerably when simple random oversampling is used as a data augmentation technique to reduce class imbalance.
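The random oversampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_oversample`, the label names, and the toy sentences are all assumptions made up for illustration. The sketch duplicates randomly chosen examples of each smaller class until every class is as large as the largest one.

```python
import random
from collections import Counter, defaultdict


def random_oversample(examples, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen examples of the
    smaller classes until every class matches the largest class in size."""
    rng = random.Random(seed)

    # Group examples by their class label.
    by_label = defaultdict(list)
    for example, label in zip(examples, labels):
        by_label[label].append(example)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_examples, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap for this class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for example in items + extra:
            out_examples.append(example)
            out_labels.append(label)

    # Shuffle so duplicated examples are not clustered together.
    order = list(range(len(out_examples)))
    rng.shuffle(order)
    return [out_examples[i] for i in order], [out_labels[i] for i in order]


# Toy, invented data: one causal sentence against four non-causal ones.
sentences = [
    "Smoking causes lung cancer.",
    "The patient was admitted on Monday.",
    "Samples were stored at room temperature.",
    "The study enrolled adult volunteers.",
    "Results are reported in the appendix.",
]
labels = ["causal", "no_relation", "no_relation", "no_relation", "no_relation"]

balanced_sentences, balanced_labels = random_oversample(sentences, labels)
print(Counter(balanced_labels))  # both classes now have 4 examples
```

In practice, oversampling of this kind is usually applied to the training split only, so that duplicated examples do not leak into the evaluation data.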
