Review

Assessing deep learning methods in cis-regulatory motif finding based on genomic sequencing data

BRIEFINGS IN BIOINFORMATICS (2022)

Journal

BRIEFINGS IN BIOINFORMATICS

Volume 23, Issue 1, Pages -

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bib/bbab374

Keywords

TF binding sites identification; motif prediction; CLIP-seq; ChIP-seq; deep learning method assessment

Funding

National Natural Science Foundation of China [62072212, 61772227]
Development Project of Jilin Province of China [20200401083GX, 20200003]
Guangdong Key Project for Applied Fundamental Research [2018KZDXM076]
Jilin Provincial Key Laboratory of Big Data Intelligent Computing [20180622002JC]


Automated Summary

Identifying cis-regulatory motifs from genomic sequencing data is crucial for understanding gene regulatory mechanisms. Deep learning methods have been widely used for this purpose, but a systematic evaluation is lacking. In this study, 20 deep learning methods were assessed using different datasets, revealing their high complementarity and the need to choose the most suitable method based on data characteristics and desired outputs.

Abstract

Identifying cis-regulatory motifs from genomic sequencing data (e.g. ChIP-seq and CLIP-seq) is crucial for identifying transcription factor (TF) binding sites and inferring gene regulatory mechanisms in any organism. Since 2015, deep learning (DL) methods have been widely applied to identify TF binding sites and predict motif patterns, offering a scalable, flexible and unified computational approach for highly accurate predictions. As far as we know, 20 DL methods have been developed. However, without a clear and systematic assessment, users will struggle to choose the most appropriate tool for their specific studies. In this manuscript, we evaluated 20 DL methods for cis-regulatory motif prediction using 690 ENCODE ChIP-seq, 126 cancer ChIP-seq and 55 RNA CLIP-seq datasets. Four aspects were assessed: the accuracy of motif finding, the performance of DNA/RNA sequence classification, algorithm scalability and tool usability. The assessment results demonstrated the high complementarity of the existing DL methods. The most suitable model depends primarily on the data size and type and on the method's outputs.
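To make the "sequence classification" aspect concrete, here is a minimal Python sketch of how a motif-based classifier could be scored. It is an illustration under stated assumptions, not the paper's evaluation pipeline: the abstract does not say which classification score was used, so AUROC is an assumption here; the single hand-written weight matrix `TOY_PWM` stands in for a learned first-layer convolution filter; and the toy sequences and all function names are hypothetical.

```python
from typing import List, Sequence

ALPHABET = "ACGT"  # column order of the one-hot encoding and of the motif matrix


def one_hot(seq: str) -> List[List[float]]:
    """Encode DNA as a len(seq) x 4 matrix; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def best_motif_score(seq: str, pwm: Sequence[Sequence[float]]) -> float:
    """Slide the motif filter along the sequence and return the best window score.

    This mirrors a single convolution filter followed by global max pooling.
    """
    x = one_hot(seq)
    width = len(pwm)
    if len(x) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        sum(x[start + i][j] * pwm[i][j] for i in range(width) for j in range(4))
        for start in range(len(x) - width + 1)
    )


def auroc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy motif "TGA": +1 for the preferred base, -1 otherwise.
TOY_PWM = [
    [-1.0, -1.0, -1.0, 1.0],  # position 1 prefers T
    [-1.0, -1.0, 1.0, -1.0],  # position 2 prefers G
    [1.0, -1.0, -1.0, -1.0],  # position 3 prefers A
]

# Hypothetical labelled sequences (e.g. peak regions vs. background).
bound = ["CCTGACC", "ATGATT"]
unbound = ["CCCCCCC", "ACACAC"]

pos = [best_motif_score(s, TOY_PWM) for s in bound]
neg = [best_motif_score(s, TOY_PWM) for s in unbound]
print("AUROC on toy data:", auroc(pos, neg))
```

A real assessment would replace the fixed matrix with each method's trained model and use held-out sequences from the ChIP-seq or CLIP-seq datasets; the scoring step itself would stay the same shape.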
