4.7 Article Proceedings Paper

A statistical method for alignment-free comparison of regulatory sequences

Motivation: The similarity of two biological sequences has traditionally been assessed within the well-established framework of alignment. Here we focus on the task of identifying functional relationships between cis-regulatory sequences that are non-orthologous or greatly diverged. 'Alignment-free' measures of sequence similarity are required in this regime.

Results: We investigate the use of a new score for alignment-free sequence comparison, called the D2z score. It is based on comparing the frequencies of all fixed-length words in the two sequences. An important, novel feature of the score is that it is comparable across sequence pairs drawn from arbitrary background distributions. We present a method that gives a quadratic improvement in the time complexity of calculating the D2z score over the naive method. We then evaluate the score on several tissue-specific families of cis-regulatory modules (in Drosophila and human). The new score is highly successful in discriminating functionally related regulatory sequences from unrelated sequence pairs. The performance of the D2z score is compared to five other alignment-free similarity measures, and it is shown to be consistently superior to all of them.
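As a rough illustration of a word-frequency comparison of this kind, the sketch below computes a D2-style statistic (the inner product of the k-word count vectors of two sequences) and standardizes it into a z-score. This is not the method of the paper: the abstract does not give the formula, and the efficient calculation it mentions is not reproduced here. Instead, the null mean and standard deviation are estimated by shuffling each sequence, a simple permutation stand-in that preserves only single-letter composition. The function names `kmer_counts`, `d2`, and `d2z_shuffled`, and the parameters `n_shuffles` and `seed`, are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """D2-style statistic: inner product of the two k-word count vectors."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b.
    return sum(n * counts_b[word] for word, n in counts_a.items())


def d2z_shuffled(seq_a, seq_b, k, n_shuffles=200, seed=0):
    """Z-score of D2 against a shuffle-based null (an assumption of this
    sketch, not the analytic calculation described in the abstract)."""
    rng = random.Random(seed)
    observed = d2(seq_a, seq_b, k)
    null_scores = []
    for _ in range(n_shuffles):
        a, b = list(seq_a), list(seq_b)
        rng.shuffle(a)  # shuffling keeps each sequence's letter composition
        rng.shuffle(b)
        null_scores.append(d2("".join(a), "".join(b), k))
    sd = pstdev(null_scores)
    if sd == 0:
        return 0.0  # degenerate null: every shuffle gave the same score
    return (observed - mean(null_scores)) / sd
```

Calling `d2z_shuffled(seq_a, seq_b, k=5)` on two sequences returns a standardized score, where larger values indicate more shared k-words than the shuffled null would suggest. Because each sequence is standardized against its own composition, scores from pairs with different base compositions are put on a common scale, which is the property the abstract emphasizes, although a shuffle-based null is much slower and cruder than an analytic one.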
