4.8 Article

Accurate prediction of functional states of cis-regulatory modules reveals common epigenetic rules in humans and mice

Journal

BMC BIOLOGY
Volume 20, Issue 1

Publisher

BMC
DOI: 10.1186/s12915-022-01426-9

Keywords

cis-regulatory modules; Enhancers; Functional states; Machine-learning; Predictions

Funding

  1. US National Science Foundation [DBI1661332]

This study proposes a two-step strategy to accurately predict the distribution of CRMs in the genome and their functional states in various cell/tissue types by integrating ChIP-seq data and using machine learning methods. The results show that functional states of CRMs can be accurately predicted using only 1 to 4 epigenetic marks, and the approach is more cost-effective than existing methods. The study also reveals common epigenetic rules for defining functional states of CRMs in humans and mice.
Background: Predicting cis-regulatory modules (CRMs) in a genome and predicting their functional states in various cell/tissue types of the organism are two related, challenging computational tasks. Most current methods attempt to achieve both simultaneously using data for multiple epigenetic marks in a cell/tissue type. Though conceptually attractive, they suffer from high false discovery rates and limited applicability. To fill these gaps, we proposed a two-step strategy: first predict a map of CRMs in the genome, then predict the functional states of all the CRMs in various cell/tissue types of the organism. We recently developed an algorithm for the first step that predicts CRMs in a genome more accurately and completely than existing methods by integrating numerous transcription factor ChIP-seq datasets in the organism. Here, we present machine-learning methods for the second step.
Results: We showed that the functional states of all the CRMs in the genome in a given cell/tissue type could be accurately predicted using data for only 1 to 4 epigenetic marks by a variety of machine-learning classifiers. Our predictions are substantially more accurate than the best achieved so far. Interestingly, a model trained on a cell/tissue type in humans can accurately predict functional states of CRMs in different cell/tissue types of humans as well as of mice, and vice versa. Therefore, the epigenetic code that defines functional states of CRMs in various cell/tissue types is universal, at least in humans and mice. Moreover, we found that tens to hundreds of thousands of CRMs were active in a human or mouse cell/tissue type; up to 99.98% of them were reutilized in different cell/tissue types, while as few as 0.02% were unique to a cell/tissue type and might define it.
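
To make the reuse statistic concrete, the following minimal Python sketch counts, for each cell/tissue type, how many active CRMs are unique to it and what fraction are also active elsewhere. The function name `reuse_summary`, the cell type labels, and the CRM identifiers are hypothetical and chosen for illustration; the toy sets are not data from the study, and the paper's exact definition of "reutilized" may differ.

```python
from collections import Counter


def reuse_summary(active_by_cell_type):
    """Per cell/tissue type: active CRM count, unique count, reused fraction.

    `active_by_cell_type` maps a cell/tissue type name to the set of CRM
    identifiers predicted active in it (hypothetical input format).
    """
    # Number of cell/tissue types in which each CRM is active.
    counts = Counter(
        crm for crms in active_by_cell_type.values() for crm in set(crms)
    )
    summary = {}
    for cell_type, crms in active_by_cell_type.items():
        crms = set(crms)
        # A CRM is "unique" here if it is active in exactly one cell type.
        unique = {crm for crm in crms if counts[crm] == 1}
        reused = len(crms) - len(unique)
        summary[cell_type] = {
            "active": len(crms),
            "unique": len(unique),
            "reused_fraction": reused / len(crms) if crms else 0.0,
        }
    return summary


# Toy, made-up identifiers; not data from the study.
toy = {
    "cell_type_A": {"crm1", "crm2", "crm3", "crm4"},
    "cell_type_B": {"crm2", "crm3", "crm4", "crm5"},
    "cell_type_C": {"crm3", "crm4"},
}
summary = reuse_summary(toy)
for name, stats in summary.items():
    print(name, stats)
```

With real predictions, the same counting would be applied to the sets of CRMs called active in each cell/tissue type.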
Conclusions: Our two-step approach can accurately predict functional states in any cell/tissue type of all the CRMs in the genome using data of only 1 to 4 epigenetic marks. Our approach is also more cost-effective than existing methods that typically use data of more epigenetic marks. Our results suggest common epigenetic rules for defining functional states of CRMs in various cell/tissue types in humans and mice.
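
The second step can be pictured as a binary classification problem: each CRM is described by the signal of a few epigenetic marks and labeled active or inactive. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It uses a plain logistic regression written in pure Python, four unnamed marks, and synthetic standardized signals in which active CRMs are centered at +1 and inactive ones at -1. The model is trained on one simulated cell type and evaluated on a second, noisier one to mimic the cross-cell-type setting described in the abstract.

```python
import math
import random


def make_synthetic(n, rng, n_marks=4, sd=1.0):
    """Simulate standardized mark signals; label 1 = active, 0 = inactive."""
    X, y = [], []
    for _ in range(n):
        label = int(rng.random() < 0.5)
        center = 1.0 if label else -1.0  # assumption: marks separate the classes
        X.append([rng.gauss(center, sd) for _ in range(n_marks)])
        y.append(label)
    return X, y


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, X):
    # Predict "active" when the linear score is non-negative.
    return [int(sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0.0) for xi in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Train on one simulated cell type, test on another with noisier signals.
X_train, y_train = make_synthetic(400, random.Random(0))
X_test, y_test = make_synthetic(500, random.Random(1), sd=1.3)
w, b = train_logistic(X_train, y_train)
acc = accuracy(y_test, predict(w, b, X_test))
print(f"held-out accuracy on the second simulated cell type: {acc:.3f}")
```

Because the data are simulated so that the classes are well separated, the accuracy printed here says nothing about performance on real data; the point is only the shape of the workflow: a small feature vector per CRM, one trained model, and evaluation on a different cell/tissue type.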
