4.0 Article

On the prediction of non-CG DNA methylation using machine learning

期刊

NAR GENOMICS AND BIOINFORMATICS
卷 5, 期 2, 页码 -

出版社

OXFORD UNIV PRESS
DOI: 10.1093/nargab/lqad045

关键词

-

向作者/读者索取更多资源

DNA methylation can be detected and measured using sequencing instruments after sodium bisulfite conversion, but experiments can be expensive for large eukaryotic genomes. Several computational methods have been proposed to predict DNA methylation from the DNA sequence or nearby methylation levels, but most are focused on CG methylation in humans and other mammals. In this study, we investigate the prediction of cytosine methylation in plant species and introduce a new classifier, AMPS, that utilizes genomic annotations to improve prediction accuracy.
DNA methylation can be detected and measured using sequencing instruments after sodium bisulfite conversion, but experiments can be expensive for large eukaryotic genomes. Sequencing nonuniformity and mapping biases can leave parts of the genome with low or no coverage, thus hampering the ability of obtaining DNA methylation levels for all cytosines. To address these limitations, several computational methods have been proposed that can predict DNA methylation from the DNA sequence around the cytosine or from the methylation level of nearby cytosines. However, most of these methods are entirely focused on CG methylation in humans and other mammals. In this work, we study, for the first time, the problem of predicting cytosine methylation for CG, CHG and CHH contexts on six plant species, either from the DNA primary sequence around the cytosine or from the methylation levels of neighboring cytosines. In this framework, we also study the cross-species prediction problem and the cross-context prediction problem (within the same species). Finally, we show that providing gene and repeat annotations allows existing classifiers to significantly improve their prediction accuracy. We introduce a new classifier called AMPS (annotation-based methylation prediction from sequence) that takes advantage of genomic annotations to achieve higher accuracy.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.0
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据