Article

Sequence2Vec: a novel embedding approach for modeling transcription factor binding affinity landscape

BIOINFORMATICS (2017)

Journal

BIOINFORMATICS

Volume 33, Issue 22, Pages 3575-3583

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bioinformatics/btx480

Categories

Biochemical Research Methods; Biotechnology & Applied Microbiology; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology; Statistics & Probability

Funding

King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) [URF/1/1976-04, URF/1/3007-01]
NSF [IIS-1218749, IIS-1639792 EAGER]
NIH [BIGDATA 1R01GM108341]
NSF CAREER [IIS-1350983]
ONR [N00014-15-1-2340]
NVIDIA
Intel
Amazon AWS

Abstract

An accurate characterization of transcription factor (TF)-DNA affinity landscape is crucial to a quantitative understanding of the molecular mechanisms underpinning endogenous gene regulation. While recent advances in biotechnology have brought the opportunity for building binding affinity prediction methods, the accurate characterization of TF-DNA binding affinity landscape still remains a challenging problem. Here we propose a novel sequence embedding approach for modeling the transcription factor binding affinity landscape. Our method represents DNA binding sequences as a hidden Markov model which captures both position specific information and long-range dependency in the sequence. A cornerstone of our method is a novel message passing-like embedding algorithm, called Sequence2Vec, which maps these hidden Markov models into a common nonlinear feature space and uses these embedded features to build a predictive model. Our method is a novel combination of the strength of probabilistic graphical models, feature space embedding and deep learning. We conducted comprehensive experiments on over 90 large-scale TF-DNA datasets which were measured by different high-throughput experimental technologies. Sequence2Vec outperforms alternative machine learning methods as well as the state-of-the-art binding affinity prediction methods.
