
MetaAc4C: A multi-module deep learning framework for accurate prediction of N4-acetylcytidine sites based on pre-trained bidirectional encoder representation and generative adversarial networks

GENOMICS (2024)

Journal

GENOMICS

Volume 116, Issue 1, Pages -

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE

DOI: 10.1016/j.ygeno.2023.110749

Keywords

N4-acetylcytidine; Deep learning; MetaAc4C

Categories

Biotechnology & Applied Microbiology; Genetics & Heredity

Abstract

In this study, we propose MetaAc4C, an advanced deep learning model for accurate identification of N4-acetylcytidine (ac4C) sites using pre-trained BERT and various optimization techniques. By adapting generative adversarial networks to address data imbalance and augmenting training RNA samples, our model outperforms existing methods in terms of ACC, MCC, and AUROC.

Motivation: N4-acetylcytidine (ac4C) is a highly conserved RNA modification that plays a crucial role in various biological processes. Accurately identifying ac4C sites is essential for understanding their regulatory mechanisms. However, existing experimental techniques for ac4C site identification are costly, and the accuracy of current computational methods leaves room for improvement.

Results: In this paper, we present MetaAc4C, a deep learning model that leverages pre-trained bidirectional encoder representations from transformers (BERT). The model is based on a bidirectional long short-term memory network (BLSTM) architecture and incorporates an attention mechanism and a residual connection. To address data imbalance, we adapt generative adversarial networks to generate synthetic feature samples. On the independent test set, MetaAc4C surpasses the current state-of-the-art ac4C prediction model, improving ACC, MCC, and AUROC by 2.36%, 4.76%, and 3.11%, respectively, on the unbalanced dataset. On the balanced dataset, it improves ACC, MCC, and AUROC by 2.6%, 5.11%, and 1.01%, respectively. Notably, augmenting the training RNA samples with WGAN-GP outperforms the SMOTE oversampling method.
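For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how ACC, MCC, and AUROC are conventionally computed for a binary site classifier. It is not taken from the MetaAc4C code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """ACC: fraction of predictions that match the true label."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a
    random negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy, imbalanced example (3 positives, 5 negatives); not paper data.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold
    print(accuracy(y_true, y_pred), mcc(y_true, y_pred), auroc(y_true, scores))
```

ACC and MCC depend on the chosen threshold, whereas AUROC is computed from the raw scores. MCC uses all four confusion-matrix cells, which is one reason it is commonly reported alongside ACC when classes are imbalanced.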
