Article

DeepFusion-RBP: Using Deep Learning to Fuse Multiple Features to Identify RNA-binding Protein Sequences

Journal

CURRENT BIOINFORMATICS
Volume 16, Issue 8, Pages 1089-1100

Publisher

BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/1574893616666210618145121

Keywords

RNA-binding protein; LSTM; deep learning; PSSM; protein sequence; word embedding

Funding

  1. National Natural Science Foundation of China [62062067, 11661081]
  2. Natural Science Foundation of Yunnan Province [2017FA032]
  3. Training Plan for Young and Middle-aged Academic Leaders of Yunnan Province [2018HB031]

The study introduces DeepFusion-RBP, a deep learning framework that cuts protein sequences into sub-sequences with a sliding window and customizes a model for each feature type, achieving accurate classification of RNA-binding proteins.
Background: RNA-binding proteins play an important role in regulating splicing, RNA transport, and other post-transcriptional processes, in identifying special RNA-binding domains, and in interacting with RNA.

Objective: This paper proposes a deep learning framework, DeepFusion-RBP, composed of three sub-models. A sliding window is used to cut each sequence into sub-sequences, local features are extracted from them, and a model is then customized for each feature.

Methods: The main advantage of this research is the use of the sliding window method to cut the original sequence. While expanding the data set, this method avoids padding sequences with too much meaningless data. A model is then customized for each feature, using methods such as LSTM, Conv1D, and amino acid embedding, to classify RNA-binding proteins accurately.

Results: To test whether the customized models improve the final prediction, we used different combinations of sub-models and test sets of different lengths. With cross-validation, the prediction ACC, F1-score, and MCC of DeepFusion-RBP are 92.62%, 91.29%, and 84.96%, respectively. DeepFusion-RBP also showed excellent performance on three independent verification sets.

Conclusion: The results of 10-fold cross-validation and of the independent verification set tests both suggest that customizing models for different features and cutting the sequences into sub-sequences yield some improvement in the prediction performance of the model. The data supporting the findings of the article are available at https://github.com/mmwangxu/DeepFusion-RBP-tool.
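
The sliding-window cutting described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `sliding_windows`, the `window` and `stride` parameters, the choice to pad only sequences shorter than one window with `X`, and the extra tail window are all assumptions made for illustration. The abstract does not state the window length or step size used in the paper.

```python
from typing import List

# Hypothetical padding symbol; "X" conventionally denotes an unknown residue.
PAD = "X"


def sliding_windows(sequence: str, window: int, stride: int) -> List[str]:
    """Cut a protein sequence into fixed-length, possibly overlapping sub-sequences.

    Assumed behavior (not taken from the paper):
    - a sequence no longer than `window` yields one sub-sequence, right-padded;
    - otherwise windows start every `stride` residues, and one extra window is
      added at the end if needed so that the tail of the sequence is covered.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(sequence) <= window:
        return [sequence.ljust(window, PAD)]
    last_start = len(sequence) - window
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + window] for s in starts]


if __name__ == "__main__":
    # A toy 9-residue sequence cut with window 4 and stride 2.
    for sub in sliding_windows("ACDEFGHIK", window=4, stride=2):
        print(sub)
```

Because long sequences are split into several fixed-length pieces rather than padded to the length of the longest sequence, each piece carries mostly real residues, and one protein contributes several training examples. That matches the abstract's point about expanding the data set while limiting meaningless filler.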
