Article

Domain-adaptive neural networks improve cross-species prediction of transcription factor binding

GENOME RESEARCH (2022)

Journal

GENOME RESEARCH

Volume 32, Issue 3, Pages 512-523

Publisher

COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT

DOI: 10.1101/gr.275394.121

Funding

National Institutes of Health (NIH) National Institute of General Medical Sciences (NIGMS) [R01GM121613]
National Science Foundation [2045500]
NIH NIGMS [DP2GM123485]
Stanford Graduate Fellowship
NIH National Institute of Diabetes and Digestive and Kidney Diseases [R24DK106766]
National Science Foundation, Directorate for Biological Sciences, Division of Biological Infrastructure [2045500]

Automated Summary

The DNA sequence preferences and cooperative partners of transcription factors (TFs) are conserved across species. However, predicting TF binding in one species based on sequence models of a closely related species is challenging due to species-specific repeats. To address this challenge, researchers used neural networks to predict TF binding across species and found that the predictive performance was worse than within-species predictions. By using an augmented network architecture, they were able to correct for prediction errors caused by species-specific repeats and improve the overall cross-species model performance.

Abstract

The intrinsic DNA sequence preferences and cell type-specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell type-specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species-specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results show that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.
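
The abstract says only that an augmented architecture discourages learning of training species-specific features; it does not name the mechanism. One widely used way to build such an objective is a gradient reversal layer feeding a species classifier, and the sketch below assumes that choice purely for illustration. It is a deliberately tiny scalar model with hand-written gradients, not the authors' network: the names `w`, `a`, `c`, `lam`, and the function `gradients` are hypothetical. The shared weight `w` receives the normal gradient from the binding loss, but the gradient from the species loss is multiplied by `-lam` before it reaches `w`, so the shared features are pushed away from whatever helps tell species apart.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def gradients(w, a, c, x, y_bind, y_species, lam):
    """Gradients for a toy two-head model with gradient reversal.

    Assumed (hypothetical) model:
      shared feature   f         = w * x
      binding head     p_bind    = sigmoid(a * f)   # bound vs. background
      species head     p_species = sigmoid(c * f)   # training vs. target species
    Both heads use binary cross-entropy. `lam` scales the reversed gradient.
    """
    f = w * x
    p_bind = sigmoid(a * f)
    p_species = sigmoid(c * f)

    # For sigmoid + binary cross-entropy, dLoss/dlogit = p - y.
    d_bind = p_bind - y_bind
    d_species = p_species - y_species

    # Each head's own weight is trained normally on its own loss.
    grad_a = d_bind * f
    grad_c = d_species * f

    # Gradients flowing back into the shared feature f.
    df_bind = d_bind * a
    # Gradient reversal: flip the sign (and scale by lam) of the species
    # gradient before it reaches the shared weight.
    df_species_reversed = -lam * d_species * c

    grad_w = (df_bind + df_species_reversed) * x
    return {"w": grad_w, "a": grad_a, "c": grad_c}
```

Calling `gradients(1.0, 1.0, 1.0, 1.0, 1, 1, lam=0.0)` gives the ordinary binding-only gradient for `w`, while `lam=1.0` subtracts the species-head contribution from it. The species head's own weight `c` still receives its usual gradient in both cases, so the species classifier keeps trying to distinguish species while the shared features are trained to frustrate it.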
