4.6 Article

OncoRTT: Predicting novel oncology-related therapeutic targets using BERT embeddings and omics features

Journal

FRONTIERS IN GENETICS
Volume 14, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fgene.2023.1139626

Keywords

machine learning; sequence embedding; omics; target identification; lung cancer; colon cancer; bioinformatics; deep neural network

Late-stage drug development failures are often due to ineffective targets. Computational approaches can help identify suitable targets by analyzing disease-related biological functions and protein data. OncoRTT, a deep learning method, was developed to predict novel therapeutic targets using features of known effective targets. It achieved high prediction performance for multiple cancer types and outperformed the state-of-the-art method in most cases. Validation evidence from the Open Targets Platform and a case study in lung cancer further supported its effectiveness.
Late-stage drug development failures are usually a consequence of ineffective targets. Proper target identification is therefore needed, and computational approaches may make it possible: effective targets have disease-relevant biological functions, and omics data reveal the proteins involved in these functions. In addition, properties that favor drug-target binding can be deduced from the protein's amino acid sequence. In this work, we developed OncoRTT, a deep learning (DL)-based method for predicting novel therapeutic targets. OncoRTT is designed to reduce suboptimal target selection by identifying novel targets based on features of known effective targets. First, we created the OncologyTT datasets, which include genes/proteins associated with ten prevalent cancer types. Then, we generated three sets of features for all genes (omics features, BERT embeddings of the proteins' amino acid sequences, and the integration of the two) and used each set separately to train and test the DL classifiers. The models achieved high prediction performance in terms of area under the curve (AUC): the AUC was greater than 0.88 for all cancer types, with a maximum of 0.95 for leukemia. OncoRTT also outperformed the state-of-the-art method on that method's own data in five of the seven cancer types assessed by both methods. Furthermore, OncoRTT predicts novel therapeutic targets using new test data related to the seven cancer types. We further corroborated these results with other validation evidence using the Open Targets Platform and a case study focused on the top-10 predicted therapeutic targets for lung cancer.
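
As a rough illustration of two ideas mentioned in the abstract, the sketch below shows one plausible way to build an integrated feature matrix and how an AUC value can be computed from classifier scores. It is not the OncoRTT implementation: the abstract does not say how the features are integrated, so column-wise concatenation is an assumption, and the function names `integrate_features` and `roc_auc` and all toy numbers are hypothetical.

```python
import numpy as np


def integrate_features(omics, seq_embeddings):
    """Join per-gene omics features and sequence embeddings column-wise.

    Assumption: integration is plain concatenation; the abstract does not
    specify the actual scheme.
    """
    omics = np.asarray(omics, dtype=float)
    seq_embeddings = np.asarray(seq_embeddings, dtype=float)
    if omics.shape[0] != seq_embeddings.shape[0]:
        raise ValueError("both feature sets must cover the same genes, in the same order")
    return np.concatenate([omics, seq_embeddings], axis=1)


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative example")
    diff = pos[:, None] - neg[None, :]  # every positive/negative pair
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


# Toy data (made up): 3 genes, 2 omics features, 3 embedding dimensions.
omics = [[0.2, 1.5], [0.9, 0.1], [0.4, 0.7]]
embeddings = [[0.1, 0.0, 0.3], [0.5, 0.2, 0.8], [0.9, 0.4, 0.6]]
features = integrate_features(omics, embeddings)

# Toy labels (1 = known target) and hypothetical classifier scores.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.3, 0.4, 0.6])
```

In this toy case three of the four positive/negative pairs are ranked correctly, so `auc` is 0.75. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values (above 0.88, up to 0.95) should be read.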
