4.7 Article

Anc2vec: embedding gene ontology terms by preserving ancestors relationships

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 23, Issue 2, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbac003

Keywords

gene ontology; neural networks; semantic similarity; protein-protein interactions

Funding

  1. Agencia Nacional de Promocion Cientifica y Tecnologica [PICT 2018 #3384]
  2. Universidad Nacional del Litoral [CAI+D 2020 115]

Anc2vec, a novel neural-network-based protocol, is proposed for constructing vector representations of GO terms that preserve ontological features and perform better than recent approaches on diverse tasks.
The gene ontology (GO) provides a hierarchical structure with a controlled vocabulary composed of terms describing the functions and localization of gene products. Recent works propose vector representations, also known as embeddings, of GO terms that capture meaningful information about them. Significant performance improvements have been observed when these representations are used in diverse downstream tasks, such as measuring semantic similarity between GO terms and functional similarity between proteins. Despite the success of these approaches, existing embeddings of GO terms still fail to capture crucial structural features of the GO. Here, we present anc2vec, a novel protocol based on neural networks for constructing vector representations of GO terms by preserving three important ontological features: ontological uniqueness, ancestor hierarchy and sub-ontology membership. The advantages of using anc2vec are demonstrated by systematic experiments on diverse tasks: visualization, sub-ontology prediction, inference of structurally related terms, retrieval of terms from aggregated embeddings, and prediction of protein-protein interactions. In these tasks, experimental results show that anc2vec representations perform better than those of recent approaches. This demonstrates that embeddings can achieve higher performance on diverse tasks when the structure of the GO is better represented. Full source code and data are available at https://github.com/sinclab/anc2vec.
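
The three ontological features named in the abstract can be illustrated with a minimal sketch. It is not the anc2vec implementation (see the repository above for that). It only assumes that each feature could be expressed as a target vector for a term: a one-hot vector for uniqueness, a multi-hot vector over its ancestors, and a one-hot vector for its sub-ontology. The toy ontology, its term identifiers and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology: term -> direct parents (not real GO identifiers).
PARENTS: Dict[str, List[str]] = {
    "BP:root": [],
    "BP:a": ["BP:root"],
    "BP:b": ["BP:root"],
    "BP:c": ["BP:a", "BP:b"],  # a term may have more than one parent
    "MF:root": [],
    "MF:x": ["MF:root"],
}
SUB_ONTOLOGIES = ["BP", "CC", "MF"]
TERMS = sorted(PARENTS)
INDEX = {term: i for i, term in enumerate(TERMS)}


def ancestors(term: str) -> Set[str]:
    """Return every term reachable through parent links, excluding the term itself."""
    found: Set[str] = set()
    stack = list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS[current])
    return found


def targets(term: str) -> Tuple[List[int], List[int], List[int]]:
    """Build the three assumed target vectors for one term."""
    # Uniqueness: one-hot vector identifying the term itself.
    one_hot = [0] * len(TERMS)
    one_hot[INDEX[term]] = 1
    # Ancestor hierarchy: multi-hot vector marking all ancestors.
    multi_hot = [0] * len(TERMS)
    for ancestor in ancestors(term):
        multi_hot[INDEX[ancestor]] = 1
    # Sub-ontology membership: taken from the toy identifier prefix (an assumption).
    prefix = term.split(":")[0]
    sub_ontology = [1 if name == prefix else 0 for name in SUB_ONTOLOGIES]
    return one_hot, multi_hot, sub_ontology


if __name__ == "__main__":
    print(TERMS)
    print(targets("BP:c"))
```

In this sketch, a model trained to predict all three vectors from a low-dimensional code of the term would be pushed to keep terms distinguishable while placing terms with shared ancestors close together; whether anc2vec is trained in exactly this way is not stated in the abstract.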
