4.5 Review

Domain architecture comparison for multidomain homology identification

期刊

JOURNAL OF COMPUTATIONAL BIOLOGY
卷 14, 期 4, 页码 496-516

出版社

MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2007.A009

关键词

homology identification; multidomain proteins; domain architecture comparison

向作者/读者索取更多资源

Homology identification is the first step for many genomic studies. Current methods, based on sequence comparison, can result in a substantial number of mis-assignments due to the similarity of homologous domains in otherwise unrelated sequences. Here we propose methods to detect homologs through explicit comparison of protein domain content. We developed several schemes for scoring the homology of a pair of protein sequences based on methods used in the field of information retrieval. We evaluate the proposed methods and methods used in the literature using a benchmark of fifteen sequence families of known evolutionary history. The results of these studies demonstrate the effectiveness of comparing domain architectures using these similarity measures. We also demonstrate the importance of both weighting promiscuous domains and of compensating for the statistical effect of having a large number of domains in a protein. Using logistic regression, we demonstrate the benefit of combining similarity measures based on domain content with sequence similarity measures.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.5
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据