4.7 Article

Neural network modeling of differential binding between wild-type and mutant CTCF reveals putative binding preferences for zinc fingers 1-2

Journal

BMC GENOMICS
Volume 23, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s12864-022-08486-9

Keywords

Mutated transcription factor; CTCF; Zinc finger; Motif; Deep neural network; Binding strength

Funding

  1. Stanford Center for Computational, Evolutionary and Human Genomics Predoctoral Fellowship
  2. Carnegie Mellon University Computational Biology Department Lane Fellowship

In this study, the researchers developed a new approach to identify the binding motifs of individual DNA binding domains (DBDs) of a transcription factor (TF). By analyzing chromatin immunoprecipitation sequencing (ChIP-seq) data, they trained a deep convolutional neural network to predict the preservation of wild-type TF binding sites in mutant TF datasets. They applied this approach to mouse CTCF ChIP-seq data and successfully identified the binding preferences of CTCF ZFs 3-11 as well as a putative GAG binding motif for ZF 1. Their findings provide new insights into the binding preferences of CTCF's DBDs.
Background: Many transcription factors (TFs), such as multi-zinc-finger (ZF) TFs, have multiple DNA binding domains (DBDs), and deciphering the DNA binding motifs of individual DBDs is a major challenge. One example of such a TF is CCCTC-binding factor (CTCF), a TF with eleven ZFs that plays a variety of roles in transcriptional regulation, most notably anchoring DNA loops. Previous studies found that CTCF ZFs 3-7 bind CTCF's core motif and ZFs 9-11 bind a specific upstream motif, but the motifs of ZFs 1-2 have yet to be identified.

Results: We developed a new approach to identifying the binding motifs of individual DBDs of a TF by analyzing chromatin immunoprecipitation sequencing (ChIP-seq) experiments in which a single DBD is mutated: we train a deep convolutional neural network to predict whether wild-type TF binding sites are preserved in the mutant TF dataset, and then interpret the model. We applied this approach to mouse CTCF ChIP-seq data and identified the known binding preferences of CTCF ZFs 3-11 as well as a putative GAG binding motif for ZF 1. We analyzed other CTCF datasets to provide additional evidence that ZF 1 is associated with binding at the motif we identified, and we found that the presence of the motif for ZF 1 is associated with CTCF ChIP-seq peak strength.

Conclusions: Our approach can be applied to any TF for which in vivo binding data from both the wild-type and mutated versions of the TF are available, and our findings provide potential new insights into the binding preferences of CTCF's DBDs.
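The training setup described in the abstract (predicting whether a wild-type binding site is preserved in the mutant dataset) needs a label for each wild-type peak and a numeric encoding of its sequence. The following is a minimal sketch of those two preparation steps only, not the authors' pipeline: the overlap-based definition of "preserved", the half-open coordinates, and the names `overlaps`, `label_preserved`, and `one_hot` are all assumptions made for illustration, and the network architecture itself is not shown because the abstract does not describe it.

```python
from typing import List, Tuple

# A peak is (chromosome, start, end) with half-open coordinates [start, end).
# This representation is an assumption for illustration.
Peak = Tuple[str, int, int]

BASES = "ACGT"


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def label_preserved(wt_peaks: List[Peak], mutant_peaks: List[Peak]) -> List[int]:
    """Label each wild-type peak 1 if any mutant peak overlaps it, else 0.

    Hypothetical labeling rule: the source only says the model predicts
    whether wild-type sites are preserved in the mutant dataset.
    """
    return [
        int(any(overlaps(wt, mt) for mt in mutant_peaks))
        for wt in wt_peaks
    ]


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA sequence as rows of [A, C, G, T] indicators.

    Characters outside ACGT (for example N) become all-zero rows.
    """
    return [[int(base == b) for b in BASES] for base in seq.upper()]


# Toy data, invented purely to exercise the functions.
wt_peaks = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 250)]
mutant_peaks = [("chr1", 250, 400), ("chr2", 400, 600)]

labels = label_preserved(wt_peaks, mutant_peaks)
encoded = one_hot("GAG")
```

In this toy example only the first wild-type peak overlaps a mutant peak, so it alone is labeled as preserved. A convolutional model would then take the one-hot matrices as input and the labels as targets; a real analysis would also need a principled threshold for what counts as a lost site.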
