Article

Clustering of highly homologous sequences to reduce the size of large protein databases

Journal

BIOINFORMATICS
Volume 17, Issue 3, Pages 282-283

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/17.3.282

Funding

  1. NIGMS NIH HHS [GM60049] Funding Source: Medline

We present a fast and flexible program for clustering large protein databases at different sequence identity levels. It takes less than 2 h for the all-against-all sequence comparison and clustering of the non-redundant protein database of over 560 000 sequences on a high-end PC. The output database, including only the representative sequences, can be used for more efficient and sensitive database searches.
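
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of one common way to reduce a database to representative sequences at a chosen identity level: greedy incremental clustering, in which sequences are visited longest first and each one either joins the first representative it matches or becomes a new representative. The function names `identity` and `cluster_representatives`, the default threshold of 0.9, and the crude identity measure built on Python's `difflib` are all assumptions made for illustration; they are not taken from the paper and would be far too slow for a database of 560 000 sequences.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Crude identity estimate (hypothetical helper, not the paper's measure):
    residues in common matching blocks divided by the shorter length."""
    if not a or not b:
        return 0.0
    blocks = SequenceMatcher(None, a, b, autojunk=False).get_matching_blocks()
    matched = sum(block.size for block in blocks)
    return matched / min(len(a), len(b))


def cluster_representatives(
    seqs: Dict[str, str], threshold: float = 0.9
) -> Dict[str, List[str]]:
    """Greedy incremental clustering (illustrative assumption).

    Returns a mapping from each representative's name to the names of
    all members of its cluster, the representative included.
    """
    clusters: Dict[str, List[str]] = {}
    # Visit longest sequences first so each representative is the
    # longest member of its cluster; ties are broken by name.
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No existing representative is similar enough.
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    example = {
        "a": "MKTAYIAKQRQISFVKSHFSRQ",
        "b": "MKTAYIAKQRQISFVKSHFSR",   # a truncated copy of "a"
        "c": "GGGGGGGGGGWWWWWWWWWW",    # unrelated to "a" and "b"
    }
    for rep, members in cluster_representatives(example, 0.9).items():
        print(rep, members)
```

Writing out only the keys of the returned mapping would give a reduced database of representatives, and rerunning with a different `threshold` corresponds to clustering at a different sequence identity level.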
