4.7 Article

Deep-4mCW2V: A sequence-based predictor to identify N4-methylcytosine sites in Escherichia coli

Journal

METHODS
Volume 203, Issue -, Pages 558-563

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.ymeth.2021.07.011

Keywords

Convolutional neural network; Modification; Feature extraction; Word embedding; N4-methylcytosine

Funding

  1. National Natural Science Foundation of China [61772119]
  2. Sichuan Provincial Science Fund for Distinguished Young Scholars [2020JDJQ0012]


This study developed a deep learning-based model to predict 4mC sites in Escherichia coli. By encoding DNA sequences and utilizing convolutional neural networks for classification, the model can accurately identify 4mC sites, providing a convenient approach for studying 4mC modification.
N4-methylcytosine (4mC) is a type of DNA modification that can regulate several biological processes, such as transcription regulation, replication, and gene expression. Precisely recognizing 4mC sites in genomic sequences can provide specific knowledge about their genetic roles. This study aimed to develop a deep-learning-based model to predict 4mC sites in Escherichia coli. In the model, DNA sequences were encoded with the word embedding technique 'word2vec'. The resulting features were fed into a 1-D convolutional neural network (CNN) to discriminate 4mC sites from non-4mC sites in the Escherichia coli genome. Evaluation on an independent dataset showed that the model achieved an overall accuracy of 0.861, about 4.3% higher than the existing model. For the convenience of other researchers, the data and source code of the model can be freely downloaded from https://github.com/linDing-groups/Deep-4mCW2V.
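The encode-then-convolve idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the k-mer length (3), the embedding dimension (4), the example sequence, the kernel weights, and all function names are assumptions chosen for illustration, and the lookup table holds random vectors standing in for embeddings that a trained word2vec model would supply. It only shows how a DNA sequence might be split into overlapping "words", mapped to vectors, and passed through a single 1-D convolution filter with max pooling.

```python
import random
from itertools import product


def kmer_tokens(seq, k=3):
    """Split a DNA sequence into overlapping k-mer 'words' (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def toy_embedding_table(k=3, dim=4, seed=0):
    """Hypothetical stand-in for word2vec: one random vector per k-mer."""
    rng = random.Random(seed)
    return {
        "".join(p): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for p in product("ACGT", repeat=k)
    }


def conv1d_relu(matrix, kernel, bias=0.0):
    """Apply one 'valid' 1-D convolution filter along the sequence axis, then ReLU.

    matrix: list of L embedding vectors (L x dim)
    kernel: list of `width` weight vectors (width x dim)
    """
    width = len(kernel)
    out = []
    for start in range(len(matrix) - width + 1):
        total = bias
        for j in range(width):
            total += sum(x * w for x, w in zip(matrix[start + j], kernel[j]))
        out.append(max(0.0, total))
    return out


# Hypothetical 21-nt sequence; real inputs would come from the published dataset.
seq = "ACGTCCGATTGCAAGTCCGTA"
tokens = kmer_tokens(seq)                  # overlapping 3-mers
table = toy_embedding_table()              # placeholder embeddings
features = [table[t] for t in tokens]      # (number of tokens) x dim matrix
kernel = [[0.5] * 4 for _ in range(3)]     # one illustrative filter of width 3
feature_map = conv1d_relu(features, kernel)
pooled = max(feature_map)                  # global max pooling over positions
```

In a full model, many such filters would be learned and the pooled values passed to a classifier layer; sequences containing characters other than A, C, G, and T would need separate handling, since this sketch's lookup table does not cover them.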
