☆ 4.7 Article

High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method

BRIEFINGS IN BIOINFORMATICS (2021)

Journal

BRIEFINGS IN BIOINFORMATICS

Volume 22, Issue 6, Pages -

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bib/bbab273

Keywords

transcription factor binding sites; Attention Gate; interpretability; motif discovery

Funding

National Natural Science Foundation of China [61922020, 61702058]
China Postdoctoral Science Foundation [2017M612948]
Scientific Research Foundation for Education Department of Sichuan Province [18ZA0098]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Automated Summary New
Abstract

Research has shown that deep learning models can predict TFBSs at the base-pair level, but there is a need for improvement to integrate information more effectively for increased accuracy.

Transcription factors (TFs) are essential proteins in regulating the spatiotemporal expression of genes. It is crucial to infer the potential transcription factor binding sites (TFBSs) with high resolution to promote biology and realize precision medicine. Recently, deep learning-based models have shown exemplary performance in the prediction of TFBSs at the base-pair level. However, the previous models fail to integrate nucleotide position information and semantic information without noisy responses. Thus, there is still room for improvement. Moreover, both the inner mechanism and prediction results of these models are challenging to interpret. To this end, the Deep Attentive Encoder-Decoder Neural Network (D-AEDNet) is developed to identify the location of TFs-DNA binding sites in DNA sequences. In particular, our model adopts Skip Architecture to leverage the nucleotide position information in the encoder and removes noisy responses in the information fusion process by Attention Gate. Simultaneously, the Transcription Factor Motif Discovery based on Sliding Window (TF-MoDSW), an approach to discover TFs-DNA binding motifs by utilizing the output of neural networks, is proposed to understand the biological meaning of the predicted result. On ChIP-exo datasets, experimental results show that D-AEDNet has better performance than competing methods. Besides, we authenticate that Attention Gate can improve the interpretability of our model by ways of visualization analysis. Furthermore, we confirm that ability of D-AEDNet to learn TFs-DNA binding motifs outperform the state-of-the-art methods and availability of TF-MoDSW to discover biological sequence motifs in TFs-DNA interaction by conducting experiment on ChIP-seq datasets.

High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method

Journal

BRIEFINGS IN BIOINFORMATICS

Publisher

OXFORD UNIV PRESS

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method

Journal

BRIEFINGS IN BIOINFORMATICS

Publisher

OXFORD UNIV PRESS

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper