4.5 Article

Taking promoters out of enhancers in sequence based predictions of tissue-specific mammalian enhancers

Journal

BMC MEDICAL GENOMICS
Volume 10, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/s12920-017-0264-3

Keywords

Enhancer prediction; Regulatory sequence; Histone modifications; Machine learning

Funding

  1. Polish National Science Center [DEC-2014/12/W/NZ1/00463, DEC-2012/05/B/NZ2/00567]

Background: Many genetic diseases are caused by mutations in non-coding regions of the genome. These mutations are frequently found in enhancers, short regulatory sequences in the non-coding part of the genome that are essential for the proper regulation of transcription, where they disrupt the regulatory program of the cell. While experimental methods for identifying such sequences improve every year, our understanding of the rules behind enhancer activity has not progressed much in the last decade. This is especially true for tissue-specific enhancers, where predicting the specificity of enhancer activity remains a clear problem.

Results: We present a random-forest-based machine learning approach that matches the performance of current state-of-the-art methods for enhancer prediction. We then show that, like other published methods, it frequently cross-predicts enhancers as active in different tissues, which makes it less useful for predicting tissue-specific activity. We trace this problem to a bias in enhancer-predicting models towards predicting gene promoters as active enhancers, and show that a two-step classifier can lower cross-prediction between tissues.

Conclusions: We provide whole-genome predictions of human heart and brain enhancers obtained with the two-step classifier.
