4.7 Article

circRNA-binding protein site prediction based on multi-view deep learning, subspace learning and multi-view classifier

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 23, Issue 1

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab394

Keywords

circRNA-RBP binding site prediction; deep feature learning; WGCCA; multi-view TSK fuzzy system

Funding

  1. National Natural Science Foundation of China [61772239, 61903248, 61725302, 61671288]
  2. Jiangnan University State Key Laboratory of Food Science and Technology Free Exploration Project [SKLF-ZZB-201901]
  3. Six Talent Peaks Project in Jiangsu Province [XYDXX-056]
  4. Jiangsu Province Natural Science Fund [BK20181339]
  5. Innovation and Technology Fund of the Hong Kong Special Administrative Region of the People's Republic of China [MRF/015/18]
  6. RGC GRF project PolyU [512006/19E]
  7. Shanghai Municipal Science and Technology Major Project [2018SHZDZX01]


This paper proposes DMSK, a multi-view classification method for identifying circRNA-RBP interaction sites that is based on multi-view deep learning, subspace learning and a multi-view classifier. The method converts circRNA sequences into pseudo-amino acid sequences and pseudo-dipeptide components, predicts the secondary structure, extracts context-dependent features, and combines a convolutional neural network with a long short-term memory network to obtain deep features. According to the reported experiments, the proposed method outperforms existing methods in predicting circRNA-RBP interactions.
Circular RNAs (circRNAs) generally bind to RNA-binding proteins (RBPs) to play an important role in the regulation of autoimmune diseases. Thus, it is crucial to study the binding sites of RBPs on circRNAs. Although many methods, including traditional machine learning and deep learning, have been developed to predict the interactions between RNAs and RBPs, most of them focus on linear RNAs. At present, few studies have addressed the binding relationships between circRNAs and RBPs, so in-depth research is urgently needed. Existing circRNA-RBP binding site prediction methods take circRNA sequences as the main research subject, but they do not fully exploit relevant characteristics of circRNAs, such as the structure and composition information of the sequences. Some methods extract different views to construct recognition models, but how to use the multi-view data efficiently is still not well studied. Considering the above problems, this paper proposes a multi-view classification method called DMSK, based on multi-view deep learning, subspace learning and a multi-view classifier, for the identification of circRNA-RBP interaction sites. In the DMSK method, we first converted circRNA sequences into pseudo-amino acid sequences and pseudo-dipeptide components to extract high-dimensional sequence features and component features of circRNAs, respectively. Then, the structure prediction method RNAfold was used to predict the secondary structure of the RNA sequences, and a sequence embedding model was used to extract the context-dependent features. Next, we fed the raw features of these four views to a hybrid network, composed of a convolutional neural network and a long short-term memory network, to obtain the deep features of circRNAs. Furthermore, we used view-weighted generalized canonical correlation analysis (WGCCA) to extract the common features of the four views by subspace learning. Finally, the learned subspace common features and the multi-view deep features were used to train the downstream multi-view TSK fuzzy system, yielding a multi-view classifier based on fuzzy rules and fuzzy inference. The trained classifier was used to predict the specific positions of the RBP binding sites on the circRNAs. The experiments show that DMSK achieves better prediction performance than the existing methods. The code and dataset of this study are available at https://github.com/Rebecca3150/DMSK.
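To make the first step of the abstract more concrete, the following is a minimal Python sketch of how a circRNA fragment could be turned into a pseudo-amino acid sequence and a pseudo-dipeptide composition vector. It is an illustration under stated assumptions, not the authors' implementation: it assumes the standard genetic code is applied to overlapping nucleotide triplets (the `stride` parameter is hypothetical), and the function names `to_pseudo_amino_acids` and `dipeptide_composition` are invented for this example. The DMSK repository may encode sequences differently.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def to_pseudo_amino_acids(rna: str, stride: int = 1) -> str:
    """Map nucleotide triplets to amino-acid symbols.

    Assumption: overlapping triplets (stride=1) are used; stride=3 would
    give an ordinary in-frame translation. Unknown bases raise KeyError.
    """
    rna = rna.upper().replace("T", "U")
    return "".join(
        CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, stride)
    )


def dipeptide_composition(pseudo_protein: str) -> dict:
    """Relative frequency of every adjacent symbol pair (21 x 21 = 441 entries)."""
    symbols = sorted(set(AMINO))
    pairs = Counter(
        pseudo_protein[i:i + 2] for i in range(len(pseudo_protein) - 1)
    )
    total = max(sum(pairs.values()), 1)  # avoid division by zero on short input
    return {a + b: pairs[a + b] / total for a, b in product(symbols, repeat=2)}


if __name__ == "__main__":
    fragment = "AUGGCC"  # toy fragment, not taken from the DMSK dataset
    pseudo = to_pseudo_amino_acids(fragment)
    composition = dipeptide_composition(pseudo)
    print(pseudo, len(composition))
```

In this sketch the pseudo-amino acid string would feed a sequence view and the 441-dimensional composition vector a component view; how the four views are actually built and passed to the CNN-LSTM network is defined in the paper and its repository, not here.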

