Article

AIControl: replacing matched control experiments with machine learning improves ChIP-seq peak identification

Journal

NUCLEIC ACIDS RESEARCH
Volume 47, Issue 10, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkz156

Keywords

-

Funding

  1. National Science Foundation CAREER [DBI-1552309, DBI-1355899]
  2. American Cancer Society [127332-RSG-15-097-01-TBG]
  3. NSF CAREER [DBI-1552309]
  4. National Institutes of Health [R35GM128638]

ChIP-seq is a technique to determine binding locations of transcription factors, which remains a central challenge in molecular biology. Current practice is to use a 'control' dataset to remove background signals from an immunoprecipitation (IP) 'target' dataset. We introduce the AIControl framework, which eliminates the need to obtain a control dataset and instead identifies binding peaks by estimating the distributions of background signals from many publicly available control ChIP-seq datasets. We thereby avoid the cost of running control experiments while simultaneously increasing the accuracy of binding location identification. Specifically, AIControl can (i) estimate background signals at fine resolution, (ii) systematically weigh the most appropriate control datasets in a data-driven way, (iii) capture sources of potential biases that may be missed by one control dataset and (iv) remove the need for costly and time-consuming control experiments. We applied AIControl to 410 IP datasets in the ENCODE ChIP-seq database, using 440 control datasets from 107 cell types to impute background signal. Without using matched control datasets, AIControl identified peaks that were more enriched for putative binding sites than those identified by other popular peak callers that used a matched control dataset. We also demonstrated that our framework identifies binding sites that recover documented protein interactions more accurately.
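
The idea of weighting many public control tracks to impute a background, then testing the IP signal against it, can be illustrated with a toy sketch. The abstract does not state which model or test AIControl uses, so the ridge regression and the Poisson upper-tail test below are assumptions made for illustration. All function names (`ridge_weights`, `impute_background`, `call_peaks`) and the toy read counts are hypothetical and are not taken from the AIControl software.

```python
import math

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_weights(controls, ip, lam=1.0):
    """Assumed weighting step: ridge regression of the IP track on the control tracks.

    controls: list of k tracks, each a list of per-bin read counts.
    ip:       per-bin read counts of the IP (target) experiment.
    Solves (C C^T + lam * I) w = C ip for the k weights w.
    """
    k = len(controls)
    A = [[sum(a * b for a, b in zip(controls[i], controls[j])) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    rhs = [sum(c * y for c, y in zip(controls[i], ip)) for i in range(k)]
    return solve(A, rhs)

def impute_background(controls, weights, floor=1e-6):
    """Weighted sum of control tracks per bin, floored so the rate stays positive."""
    n_bins = len(controls[0])
    return [max(sum(w * c[i] for w, c in zip(weights, controls)), floor)
            for i in range(n_bins)]

def poisson_sf(k, mu):
    """P(X >= k) for X ~ Poisson(mu), computed by summing the lower tail."""
    term = math.exp(-mu)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= mu / (i + 1)
    return max(0.0, 1.0 - cdf)

def call_peaks(ip, background, alpha=1e-3):
    """Return indices of bins whose IP count is unlikely under the imputed background."""
    return [i for i, (y, mu) in enumerate(zip(ip, background))
            if poisson_sf(y, mu) < alpha]

# Toy data (made up): two control tracks and one IP track with a spike in bin 5.
controls = [[4, 5, 6, 5, 4, 5, 6, 5],
            [10, 2, 9, 1, 10, 2, 9, 1]]
ip = [4, 5, 6, 5, 4, 40, 6, 5]

weights = ridge_weights(controls, ip)
background = impute_background(controls, weights)
peaks = call_peaks(ip, background)
print("weights:", weights)
print("peak bins:", peaks)
```

In this sketch no matched control is supplied for the IP track: the background rate in each bin comes entirely from the weighted pool of control tracks, which is the substitution the abstract describes. A real pipeline would work on genome-wide binned read counts from hundreds of controls and would need multiple-testing correction, neither of which is shown here.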
