Article

TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants

Journal

FRONTIERS IN PLANT SCIENCE
Volume 14

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fpls.2023.1175837

Keywords

transcription factor binding sites; DenseNet; core motif; biological interpretability; trans-species prediction

Promoter tiling deletion via genome editing is an emerging approach that is becoming popular in plants. However, the precise positions of core motifs within plant gene promoters are largely unknown. In this study, the researchers developed TSPTFBS 2.0, which integrates DenseNet-based models and three interpretability methods to identify potential core motifs in genomic regions. The resulting web server has great potential for providing reliable editing targets for genetic screening experiments in plants.

Introduction: Promoter tiling deletion via genome editing is an emerging approach that is gaining popularity in plants. Identifying the precise positions of core motifs within plant gene promoters is in great demand, but these positions remain largely unknown. We previously developed TSPTFBS, a set of 265 Arabidopsis transcription factor binding site (TFBS) prediction models, which cannot meet this demand for core motif identification.

Methods: Here, we additionally introduced 104 maize and 20 rice TFBS datasets and used DenseNet to construct models on a large-scale dataset covering a total of 389 plant TFs. More importantly, we combined three biological interpretability methods (DeepLIFT, in-silico tiling deletion, and in-silico mutagenesis) to identify the potential core motifs of any given genomic region.

Results: DenseNet not only achieved greater predictability than baseline methods such as LS-GKM and MEME for the above 389 TFs from Arabidopsis, maize, and rice, but also performed better on trans-species prediction of a total of 15 TFs from six other plant species. A motif analysis based on TF-MoDISco and global importance analysis (GIA) further provided the biological implications of the core motifs identified by the three interpretability methods. Finally, we developed the TSPTFBS 2.0 pipeline, which integrates the 389 DenseNet-based models of TF binding and the above three interpretability methods.

Discussion: TSPTFBS 2.0 was implemented as a user-friendly web server (http://www.hzau-hulab.com/TSPTFBS/), which can provide important references for editing targets in any given plant promoter and has great potential to provide reliable editing targets for genetic screening experiments in plants.
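
To make the two perturbation-based interpretability methods named in the abstract more concrete, the following is a minimal sketch of in-silico mutagenesis and in-silico tiling deletion. It is not the TSPTFBS 2.0 implementation. The function names, the window size, and the choice to mask a window with zeros rather than physically delete it are all assumptions made for illustration, and `toy_score` is a hypothetical stand-in for a trained DenseNet model that simply measures the best match to a fixed motif. DeepLIFT is not covered here because it requires access to a real network's gradients and activations.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as an (L, 4) one-hot array; unknown bases stay all-zero."""
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            arr[i, j] = 1.0
    return arr

def toy_score(x, motif="CACGTG"):
    """Hypothetical stand-in for a trained model (assumption, not from the paper).

    Returns the best fraction of motif positions matched over all windows.
    """
    m = one_hot(motif)
    k = len(motif)
    best = 0.0
    for start in range(x.shape[0] - k + 1):
        best = max(best, float((x[start:start + k] * m).sum()) / k)
    return best

def in_silico_mutagenesis(seq, score_fn):
    """Score change for every single-base substitution, as an (L, 4) array.

    Entries for the reference base are left at 0.
    """
    x = one_hot(seq)
    ref = score_fn(x)
    delta = np.zeros((len(seq), 4), dtype=np.float32)
    for i in range(len(seq)):
        for j in range(4):
            if x[i, j] == 1.0:
                continue  # skip the reference base
            mutated = x.copy()
            mutated[i] = 0.0
            mutated[i, j] = 1.0
            delta[i, j] = score_fn(mutated) - ref
    return delta

def tiling_deletion(seq, score_fn, window=6, step=1):
    """Score drop when each window is masked with zeros (assumed deletion model).

    Returns a list of (window_start, score_drop) pairs.
    """
    x = one_hot(seq)
    ref = score_fn(x)
    drops = []
    for start in range(0, len(seq) - window + 1, step):
        masked = x.copy()
        masked[start:start + window] = 0.0
        drops.append((start, ref - score_fn(masked)))
    return drops

# Usage: a toy sequence with the motif embedded at positions 8-13.
seq = "TTTTTTTT" + "CACGTG" + "TTTTTTTT"
delta = in_silico_mutagenesis(seq, toy_score)
drops = tiling_deletion(seq, toy_score)
```

In this toy setting, positions whose substitutions lower the score, and windows whose masking causes the largest drop, both point at the embedded motif. With a real model, `score_fn` would wrap the model's predicted binding probability instead.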
