4.7 Article

Capturing large genomic contexts for accurately predicting enhancer-promoter interactions

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 23, Issue 2, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab577

Keywords

enhancer-promoter interaction; chromatin structure; Transformer; non-coding mutation

Funding

  1. National Key R&D Program of China [2020YFB0204803]
  2. National Natural Science Foundation of China [61772566]
  3. Guangdong Key Field R&D Plan [2019B020228001, 2018B010109006]
  4. Guangzhou S&T Research Plan [202007030010]
  5. Introducing Innovative and Entrepreneurial Teams [2016ZT06D211]

Abstract

This study presents TransEPI, a Transformer-based model that predicts enhancer-promoter interactions (EPIs) by capturing large genomic contexts. TransEPI consistently outperforms other machine learning and deep learning models on datasets from different cell types.

Enhancer-promoter interaction (EPI) is a key mechanism underlying gene regulation. EPI prediction remains challenging because enhancers can regulate the promoters of distant target genes. Although many machine learning models have been developed, they use only the features within enhancers and promoters, or simply add the average genomic signals of the regions between them, without exploiting detailed features between or outside the enhancer and promoter. Lacking such large-scale features, existing methods achieve only moderate performance, especially when predicting EPIs in different cell types. Here, we present TransEPI, a Transformer-based model that predicts EPIs by capturing large genomic contexts. TransEPI was developed on EPI datasets derived from Hi-C or ChIA-PET data in six cell lines. To guard against over-fitting, we evaluated the model on independent test datasets whose cell lines and chromosomes differ from those in the training data. TransEPI not only performed consistently across the cross-validation and test datasets from different cell types but also outperformed state-of-the-art machine learning and deep learning models. In addition, we found that the improved performance of TransEPI is attributable to the integration of large genomic contexts. Lastly, we extended TransEPI to study non-coding mutations associated with brain disorders or neural diseases and found that it is also useful for predicting the target genes of non-coding mutations.
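
The evaluation idea described in the abstract, testing only on samples whose cell line and chromosome both differ from the training data, can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the record fields `cell_line` and `chrom`, the function name `split_holdout`, and the placeholder values are all assumptions made for this sketch.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical record layout: one dict per enhancer-promoter pair.
Sample = Dict[str, str]


def split_holdout(
    samples: List[Sample],
    test_cell_lines: Set[str],
    test_chroms: Set[str],
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples so the test set shares neither cell line nor chromosome with training."""
    train: List[Sample] = []
    test: List[Sample] = []
    for s in samples:
        in_test_cell = s["cell_line"] in test_cell_lines
        in_test_chrom = s["chrom"] in test_chroms
        if in_test_cell and in_test_chrom:
            test.append(s)
        elif not in_test_cell and not in_test_chrom:
            train.append(s)
        # Samples matching only one criterion are dropped, so no cell line
        # or chromosome appears on both sides of the split.
    return train, test


# Placeholder data; the names are illustrative, not taken from the paper.
samples = [
    {"cell_line": "cell_A", "chrom": "chr1"},
    {"cell_line": "cell_A", "chrom": "chr2"},
    {"cell_line": "cell_B", "chrom": "chr1"},
    {"cell_line": "cell_B", "chrom": "chr2"},
]
train, test = split_holdout(samples, test_cell_lines={"cell_B"}, test_chroms={"chr2"})
```

Dropping the mixed samples costs data but keeps the test set independent on both axes, which is the property the abstract emphasizes.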
