4.7 Article

Fast decoding cell type–specific transcription factor binding landscape at single-nucleotide resolution

Hongyang Li and Yuanfang Guan

Journal

GENOME RESEARCH
Volume 31, Issue 4, Pages 721-731

Publisher

COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT
DOI: 10.1101/gr.269613.120

Funding

  1. National Institutes of Health/National Institute of General Medical Sciences [R35-GM133346-01]
  2. National Science Foundation, Directorate for Biological Sciences, Division of Biological Infrastructure [1452656]
  3. American Heart Association [19AMTG34850176]

The deep learning approach Leopard predicts transcription factor (TF) binding sites at single-nucleotide resolution (average AUROC 0.982, average AUPRC 0.208) and outperforms the state-of-the-art methods Anchor and FactorNet, improving AUPRC by 19% and 27%, respectively, at 200-bp resolution.
Decoding the cell type–specific transcription factor (TF) binding landscape at single-nucleotide resolution is crucial for understanding the regulatory mechanisms underlying many fundamental biological processes and human diseases. However, limits on time and resources restrict high-resolution experimental measurement of TF binding profiles for all possible TF–cell type combinations. Previous computational approaches either cannot distinguish cell context–dependent TF binding profiles across diverse cell types or can provide only relatively low-resolution predictions. Here we present a novel deep learning approach, Leopard, for predicting TF binding sites at single-nucleotide resolution, achieving an average area under the receiver operating characteristic curve (AUROC) of 0.982 and an average area under the precision-recall curve (AUPRC) of 0.208. Our method substantially outperformed the state-of-the-art methods Anchor and FactorNet, improving the predictive AUPRC by 19% and 27%, respectively, when evaluated at 200-bp resolution. Meanwhile, by leveraging a many-to-many neural network architecture, Leopard achieves a hundredfold to thousandfold speedup compared with current many-to-one machine learning methods.
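The abstract reports a very high AUROC alongside a much lower AUPRC. That pattern is typical when positives are rare, as bound nucleotides are relative to the genome. The sketch below is not the authors' evaluation code. It is a minimal pure-Python illustration in which the function names `auroc` and `auprc` and the toy labels and scores are all hypothetical, and AUPRC is estimated as average precision under the assumption that scores are distinct.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Tied scores share the average of the 1-based ranks they span.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def auprc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean precision at the rank of each positive.

    Assumes distinct scores (ties are not grouped in this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: 100 positions, one true binding site at index 10.
labels = [0] * 100
labels[10] = 1
scores = [i / 1000 for i in range(100)]  # low background scores
scores[10] = 0.90  # the true site scores high ...
scores[20] = 0.95  # ... but two unbound positions score higher
scores[30] = 0.99

print(f"AUROC = {auroc(labels, scores):.3f}")  # AUROC = 0.980
print(f"AUPRC = {auprc(labels, scores):.3f}")  # AUPRC = 0.333
```

In this toy case the true site outranks 97 of the 99 unbound positions, so AUROC is 97/99, yet only one of the top three calls is correct, so average precision is 1/3. With rare positives, a handful of high-scoring false positives barely moves AUROC but sharply lowers AUPRC.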
