Article

SRG-Vote: Predicting miRNA-Gene Relationships via Embedding and LSTM Ensemble

Journal

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS
Volume 26, Issue 8, Pages 4335-4344

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JBHI.2022.3169542

Keywords

Feature extraction; Deep learning; Predictive models; Biological system modeling; Urban areas; Bioinformatics; Data mining; Ensemble; LSTM; miRNA-gene relationships

Funding

  1. National Natural Science Foundation of China [32170654, 32000464]
  2. Shenzhen Research Institute, City University of Hong Kong
  3. Health and Medical Research Fund
  4. Food and Health Bureau, The Government of the Hong Kong Special Administrative Region [07181426]
  5. City University of Hong Kong [CityU 11202219, CityU 11203520, CityU 11203221]

This paper proposes a model that combines feature extraction methods, deep learning algorithms, and a voting system to study the relationship between miRNAs and genes. By using high-throughput technology to process large amounts of biological data, the model is able to reveal potential associations between miRNAs and genes in cancer therapy.
Targeted therapy for one or a set of genes has made it possible to apply precision medicine to different patients, given the existence of tumor heterogeneity. However, how to regulate those genes remains problematic. MicroRNAs (miRNAs) are among the natural regulators of genes. Thus, a better understanding of the miRNA-gene interaction mechanism might contribute to future cancer diagnosis, prevention, and therapy. The interactions between miRNAs and genes play an essential role in molecular genetics, but in-vivo experiments that validate these relationships are time-consuming, costly, and labor-intensive. With the development of high-throughput technology, large amounts of biological data have become available. However, extracting features from such large volumes of raw data and building a mathematical model remains challenging. Machine learning and deep learning algorithms have become powerful tools for handling biological data. Inspired by this, we propose a model that combines feature/embedding extraction methods, deep learning algorithms, and a voting system. We leverage doc2vec to generate sequential embeddings from molecular sequences. Geometrical embeddings are generated with role2vec, GCN, and GMM from the complex network built from similarity and pair-wise datasets. For the deep learning algorithms, we use LSTM and Bi-LSTM, chosen according to the embeddings and features. Finally, we adopt a voting system to balance results from different data sources. The results show that our voting system achieves a higher AUC than the existing benchmark. The case studies demonstrate that our model can reveal potential relationships between miRNAs and genes. The source code, features, and predictive results can be downloaded at https://github.com/Xshelton/SRG-vote.
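The abstract does not spell out the voting rule, so the following is only a minimal sketch of one common choice, soft voting, in which per-model probabilities for the same miRNA-gene pairs are averaged and the result is scored by AUC. The function names, the equal default weights, and the toy scores are all hypothetical and are not taken from the SRG-Vote source code.

```python
def soft_vote(prob_lists, weights=None):
    """Weighted average of per-model probabilities (assumed voting rule)."""
    if weights is None:
        weights = [1.0] * len(prob_lists)  # assumption: equal weight per model
    n_pairs = len(prob_lists[0])
    if any(len(p) != n_pairs for p in prob_lists):
        raise ValueError("every model must score the same pairs")
    total = sum(weights)
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_pairs)
    ]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = known miRNA-gene relationship, 0 = no relationship.
labels = [1, 0, 1, 0]
sequence_model = [0.9, 0.4, 0.3, 0.2]  # e.g. scores from a sequence-embedding model
network_model = [0.6, 0.7, 0.9, 0.1]   # e.g. scores from a network-embedding model

combined = soft_vote([sequence_model, network_model])
print("sequence AUC:", auc(labels, sequence_model))
print("network AUC:", auc(labels, network_model))
print("voted AUC:", auc(labels, combined))
```

In this toy case each model misranks a different pair, so averaging corrects both errors. Real gains depend on how correlated the individual models' mistakes are.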
