Article

T-REKS: identification of Tandem REpeats in sequences with a K-meanS based algorithm

Journal

BIOINFORMATICS
Volume 25, Issue 20, Pages 2632-2638

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btp482

Keywords

-

Funding

  1. Ministere de l'Education Nationale, de la Recherche et de la Technologie (MENRT)


Motivation: In recent years, evidence has accumulated that tandem repeats occur frequently in proteins that carry out fundamental biological functions and are associated with a number of human diseases. At the same time, protein repeats often degenerate strongly during evolution and therefore cannot be easily identified. Several computer programs based on different algorithms have been developed to solve this problem. Nevertheless, our tests showed that there is still room for improvement in methods for the accurate and rapid detection of tandem repeats in proteins.

Results: We developed a new program, T-REKS, for ab initio identification of tandem repeats. It is based on clustering the lengths between identical short strings using a K-means algorithm. A benchmark of existing programs and T-REKS on several sequence datasets is presented. Linked to the Protein Repeat DataBase, our program opens the way for large-scale analysis of protein tandem repeats. T-REKS can also be applied to nucleotide sequences.
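The core idea named in the abstract, clustering the lengths between identical short strings with K-means, can be illustrated with a minimal sketch. This is not the T-REKS implementation: the abstract does not give the string length, the number of clusters, or any later refinement steps, so the function names (`kmer_distances`, `kmeans_1d`, `dominant_period`), the defaults `k=2` and `n_clusters=2`, and the rule of taking the most populated cluster as the candidate repeat length are all assumptions made for illustration.

```python
from collections import defaultdict


def kmer_distances(seq, k=2):
    """Distances between consecutive occurrences of each identical k-mer."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    dists = []
    for pos in positions.values():
        dists.extend(b - a for a, b in zip(pos, pos[1:]))
    return dists


def kmeans_1d(values, n_clusters=2, n_iter=50):
    """Plain Lloyd-style K-means on scalars; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly between min and max.
    centers = [lo + (hi - lo) * (j + 0.5) / n_clusters
               for j in range(n_clusters)]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(n_clusters), key=lambda j: abs(v - centers[j]))
                  for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(n_clusters):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members)
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def dominant_period(seq, k=2, n_clusters=2):
    """Assumed heuristic: center of the most populated distance cluster."""
    dists = kmer_distances(seq, k)
    if not dists:
        return None
    centers, labels = kmeans_1d(dists, n_clusters)
    counts = [labels.count(j) for j in range(n_clusters)]
    best = max(range(n_clusters), key=counts.__getitem__)
    return round(centers[best])


if __name__ == "__main__":
    # Four copies of a 7-residue unit, one copy carrying a substitution.
    print(dominant_period("ACDEFGHACDEFGHACNEFGHACDEFGH"))
```

In this toy sequence most distances between identical 2-mers equal the unit length, while the substituted copy produces a few doubled distances; clustering separates the two groups, which hints at why a distance-based approach can tolerate some degeneracy.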
