Article

A deep learning relation extraction approach to support a biomedical semi-automatic curation task: The case of the gluten bibliome

EXPERT SYSTEMS WITH APPLICATIONS (2022)

Journal

EXPERT SYSTEMS WITH APPLICATIONS

Volume 195

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.eswa.2022.116616

Keywords

Text mining; Relation extraction; Deep learning; Ontology-based methods; Literature curation; Gluten

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science

Funding

Associate Laboratory for Green Chemistry -LAQV - Portuguese Foundation for Science and Technology (FCT/MCTES) [UIDB/50006/2020]
Conselleria de Educacion, Universidades e Formación Profesional (Xunta de Galicia) [ED431C2018/55-GRC]
Centro singular de investigacion de Galicia (accreditation 2019-2022) - European Regional Development Fund (ERDF) [ED431G2019/06]
Xunta de Galicia [ED481B-2019-032]
Universidade de Vigo/CISUG

Abstract

Discovering relevant biomedical interactions is crucial for biology research. This study proposes a novel vector-space integrated with a deep learning model to assist manual curators in a real curation task. Experimental results show that the proposed workflow is valuable for semi-automatic relation extraction and saves manual annotation efforts.

Discovering relevant biomedical interactions in the literature is crucial for enhancing biology research. This curation process plays an essential role in studying the reported processes and interactions that affect biological processes (e.g., genome, metabolome, and transcriptome). In this sense, the objective of this work is twofold: to reduce the manual effort required to curate and review the biochemical interactions reported in the gluten-related bibliome, and to propose a novel vector space integrated into a deep learning model that assists manual curators in a real curation task by learning from their previous decisions. To this end, the present work proposes a novel vector space that combines (i) high-level lexical and syntactic inference features such as WordNets and health-related domain ontologies, (ii) unsupervised semantic resources such as word embeddings, (iii) semantic and syntactic sentence knowledge, (iv) abbreviation resolution support, (v) several state-of-the-art named-entity recognition methods, and, finally, (vi) different feature construction and optimization techniques to support a semi-automatic curation workflow. Applying the proposed workflow to a classified set of 2,451 relevant gluten-related documents produced a total of 8,349 relevant and 471,813 irrelevant relations distributed across thirteen health-related domain categories. Experimental results showed that the proposed workflow is a valuable approach for a semi-automatic relation extraction task. It obtained satisfactory results in the early stages of a real-world curation task and saved manual annotation effort by learning from the decisions made by manual curators in iterative annotation rounds. The average F-score across the proposed relation categories was 0.731, with the lowest F-score at 0.47 and the highest at 0.929.
The different resources used in this work, as well as the manually curated corpus, are publicly available on our GitHub repository.
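
To make the reported figures easier to interpret, the following is a minimal Python sketch of how a per-category F-score and its unweighted average could be computed, together with the class balance implied by the relation counts in the abstract. The abstract does not state which averaging scheme was used, so the macro average here is an assumption; the function names, category names, and true-positive, false-positive, and false-negative counts are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def f_score(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall (0.0 if no true positives)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f_score(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-category F1 scores (assumed averaging scheme)."""
    return sum(f_score(*c) for c in counts.values()) / len(counts)


# Hypothetical (tp, fp, fn) counts for three made-up relation categories.
toy_counts = {
    "category_a": (90, 10, 10),
    "category_b": (50, 50, 50),
    "category_c": (8, 2, 12),
}
toy_macro = macro_f_score(toy_counts)

# Relation counts as reported in the abstract.
relevant, irrelevant = 8_349, 471_813
relevant_share = relevant / (relevant + irrelevant)

print(f"toy macro F-score: {toy_macro:.3f}")
print(f"share of relevant relations: {relevant_share:.2%}")
```

With the reported counts, fewer than 2% of the candidate relations are relevant, which is one reason an F-score is more informative than plain accuracy for this kind of task.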
