Article

An alignment-free method to identify candidate orthologous enhancers in multiple Drosophila genomes

Journal

BIOINFORMATICS
Volume 26, Issue 17, Pages 2109-2115

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btq358

Funding

  1. National Institutes of Health [R01 HG004065]
  2. Human Frontier Science Program [RGY0084/2008]

Motivation: Evolutionarily conserved non-coding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions such as transcriptional enhancers. However, detecting orthologous enhancers using alignment-based methods in higher eukaryotic genomes is particularly challenging, as regulatory regions can undergo considerable sequence changes while maintaining their functionality.

Results: We have developed an alignment-free method that identifies conserved enhancers in multiple diverged species. Our method uses similarity metrics between two sequences that are computed from the co-occurrence of sequence patterns regardless of their order and orientation, thus tolerating the sequence changes observed in non-coding evolution. We show that our method is highly successful in detecting orthologous enhancers in distantly related species without requiring additional information such as knowledge about the transcription factors involved or predicted binding sites. By estimating the significance of similarity scores, we are able to discriminate experimentally validated functional enhancers from seemingly equally conserved candidates without function. We demonstrate the effectiveness of this approach on a wide range of enhancers in Drosophila, and also present encouraging results for detecting conserved functional regions across large evolutionary distances. Our work provides encouraging steps on the way to ab initio unbiased enhancer prediction to complement ongoing experimental efforts.
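
To make the idea of an order- and orientation-independent similarity concrete, the following is a minimal Python sketch. It is not the metric used in the paper, which the abstract does not specify. As an illustrative assumption it uses cosine similarity over k-mer counts, where each k-mer is merged with its reverse complement so that strand orientation is ignored. The function names and the default word length k = 5 are hypothetical.

```python
from collections import Counter
from math import sqrt

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmer_counts(seq: str, k: int = 5) -> Counter:
    """Count k-mers, merging each word with its reverse complement.

    Positions are discarded, so the profile ignores word order; merging
    the two strands makes it independent of orientation as well.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip words containing N or other symbols
            counts[min(kmer, reverse_complement(kmer))] += 1
    return counts


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def kmer_similarity(seq1: str, seq2: str, k: int = 5) -> float:
    """Alignment-free similarity of two sequences (illustrative assumption,
    not the authors' metric)."""
    return cosine_similarity(canonical_kmer_counts(seq1, k),
                             canonical_kmer_counts(seq2, k))
```

Under this sketch a sequence and its reverse complement produce identical word profiles, so their similarity is 1 up to floating-point rounding, whereas two sequences that share no words score 0. Deciding whether an intermediate score is meaningful would additionally require a significance estimate against a background model, as the abstract describes.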
