Article

Local homology recognition and distance measures in linear time using compressed amino acid alphabets

NUCLEIC ACIDS RESEARCH (2004)

Journal

NUCLEIC ACIDS RESEARCH

Volume 32, Issue 1, Pages 380-385

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/nar/gkh180

Categories

Biochemistry & Molecular Biology

Abstract

Methods for discovery of local similarities and estimation of evolutionary distance by identifying k-mers (contiguous subsequences of length k) common to two sequences are described. Given unaligned sequences of length L, these methods have O(L) time complexity. The ability of compressed amino acid alphabets to extend these techniques to distantly related proteins was investigated. The performance of these algorithms was evaluated for different alphabets and choices of k using a test set of 1848 pairs of structurally alignable sequences selected from the FSSP database. Distance measures derived from k-mer counting were found to correlate well with percentage identity derived from sequence alignments. Compressed alphabets were seen to improve performance in local similarity discovery, but no evidence was found of improvements when applied to distance estimates. The performance of our local similarity discovery method was compared with the fast Fourier transform (FFT) used in MAFFT, which has O(L log L) time complexity. Our method achieves coverage comparable to FFT and is more than an order of magnitude faster. We suggest using k-mer distance for fast, approximate phylogenetic tree construction, and show that a speed improvement of more than three orders of magnitude can be achieved relative to standard distance methods, which require alignments.
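As a rough illustration of the k-mer counting idea in the abstract, the Python sketch below estimates a distance between two unaligned protein sequences from the k-mers they share, optionally after mapping residues onto a compressed alphabet. It is a minimal sketch under stated assumptions, not the paper's method: the abstract does not give the exact distance formula or the alphabets tested, so the residue grouping in `GROUPS`, the function names, and the "one minus fraction of shared k-mers" measure are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical compressed alphabet: these ten groups are an illustrative
# assumption and are NOT taken from the paper.
GROUPS = ["AST", "C", "DEN", "FY", "G", "H", "ILMV", "KQR", "P", "W"]
COMPRESS = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def compress(seq):
    """Map each residue to its group label; unknown residues become 'X'."""
    return "".join(COMPRESS.get(aa, "X") for aa in seq.upper())


def kmer_counts(seq, k):
    """Count every contiguous k-mer in a single pass over the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k=3, compressed=False):
    """Return 1 - (shared k-mers / k-mers in the shorter sequence).

    Assumed measure for illustration; the result lies in [0, 1].
    """
    if compressed:
        a, b = compress(a), compress(b)
    n = min(len(a), len(b)) - k + 1
    if n <= 0:
        raise ValueError("sequences must be at least k residues long")
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    # A k-mer occurring several times is shared up to its smaller count.
    shared = sum(min(c, counts_b[kmer]) for kmer, c in counts_a.items())
    return 1.0 - shared / n
```

For fixed k, each sequence is scanned once and k-mers are stored in a hash table, so the expected running time grows linearly with sequence length and no alignment is needed. With the grouping assumed here, `kmer_distance("IIII", "LLLL")` is 1.0 on the full alphabet but 0.0 with `compressed=True`, because I and L fall in the same group; this is the sense in which a compressed alphabet can let k-mer matching reach more divergent sequences.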
