DeepSite: bidirectional LSTM and CNN models for predicting DNA-protein binding

Journal

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s13042-019-00990-x

Keywords

DNA-protein binding; Deep learning; Bidirectional long short-term memory; Convolutional neural networks

Transcription factors are cis-regulatory molecules that bind to specific sub-regions of DNA promoters and initiate transcription, the process that regulates the conversion of genetic information from DNA to RNA. Several computational methods have been developed to predict DNA-protein binding sites in DNA sequences using convolutional neural networks (CNNs). However, these techniques could not capture the dependency information in DNA sequences within the CNN framework. In addition, these methods are not accurate enough in predicting DNA-protein binding sites from the DNA sequence. In this study, we combine bidirectional long short-term memory (BLSTM) and a CNN to capture long-term dependencies between sequence motifs in DNA; we call the resulting method DeepSite. Unlike a traditional CNN, DeepSite comprises six layers: an input layer, a BLSTM layer, a CNN layer, a pooling layer, a fully connected layer, and an output layer. DeepSite predicts DNA-protein binding sites with 87.12% sensitivity, 91.06% specificity, 89.19% accuracy, and 0.783 MCC when tested on the 690 ChIP-seq experiments from ENCODE. Lastly, we conclude that our proposed method can also be applied to find DNA-protein binding sites in different DNA sequences.
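
The four figures reported in the abstract (sensitivity, specificity, accuracy, and MCC) are standard binary-classification metrics derived from a confusion matrix. The sketch below shows how they are conventionally computed. It is illustrative only and is not taken from the paper: the function name `binary_metrics` and the example counts are hypothetical, and the abstract does not state how the authors aggregated results across the 690 experiments.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: bound sequences predicted as bound / unbound.
    tn/fp: unbound sequences predicted as unbound / bound.
    """

    def safe_div(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)   # true positive rate
    specificity = safe_div(tn, tn + fp)   # true negative rate
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)

    # Matthews correlation coefficient: ranges from -1 to 1,
    # and stays informative when the classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, denom)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts chosen for illustration; these are NOT the paper's data.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC is reported alongside accuracy because it uses all four cells of the confusion matrix, so a model cannot score well on it by favoring one class.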
