3.8 Proceedings Paper

OntoSem: an Ontology Semantic Representation Methodology for Biomedical Domain

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/BIBM49941.2020.9313128

Keywords

ontology semantic representation; BERT; Word2Vec; deep learning; semantic similarity

Funding

  1. National Natural Science Foundation of China (NSFC) [61872114, 61305013]

Ask authors/readers for more resources

Ontologies are essential description tools for biomedical concepts and entities, supporting biomedical fundamental research such as semantic similarity analysis, protein-protein interaction prediction and so on. An increasing amount of ontology-like domain knowledge is published in scientific publications, meanwhile, advanced natural language processing (NLP) techniques have been widespread to extract information from text resources automatically, both of which facilitate the exploration of the semantic representation of biomedical ontologies. We propose a novel distributional semantic representation methodology based on the combination of two pre-trained and domain-specific word embedding tools, the non-contextualized Word2Vec and the context-dependent NCBI-blueBERT, to enhance the encoding ability for biomedical ontologies. Furthermore, we utilize a randomly initialized bidirectional LSTM to project the obtained word vector sequence to a fixed-length sentence vector, facilitating a flexible and uniform way for the computation of downstream tasks. We evaluate our method in two categories of tasks: the similarity access of ontology terms, and the ontology annotationbased protein-protein interaction classification. Experimental results demonstrate that our method provides encouraging results compared to the baselines in all tests. Our approach offers promising opportunities for representing ontologies semantics and in turn characterizing entities including proteins in biomedical research.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

3.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available