Protein sequence redundancy reduction: comparison of various methods

BIOINFORMATION (2010)

Journal

BIOINFORMATION

Volume 5, Issue 6, Pages 234-239

Publisher

BIOMEDICAL INFORMATICS

DOI: 10.6026/97320630005234

Keywords

protein sequence; removing redundancy; sequence alignment

Funding

BIN-II
BIN-III (GEN-AU Austrian research program)
Ministry of Science, Republic of Croatia [036-0362214-1987]

Abstract

Non-redundant protein datasets are of utmost importance in bioinformatics. Constructing such a dataset means removing protein sequences whose similarity to other sequences in the set exceeds a chosen threshold. Several programs are available for this purpose, including 'Decrease redundancy', 'cd-hit', 'Pisces', 'BlastClust' and 'SkipRedundant'. The question we focus on here is to what extent the non-redundant datasets produced by different programs resemble each other. We performed a systematic comparison of the features and outputs of these programs, using subsets of the UniProt database, and describe it here. The results show a high level of overlap between non-redundant datasets obtained with the same program from the same initial dataset at different percent identity thresholds, and only moderate similarity between datasets obtained with different programs from the same initial dataset at the same percent identity threshold. Users should therefore be aware that the output depends on the program used, and the use of more than one program is advisable.
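To make the two ideas in the abstract concrete (filtering by a similarity threshold, then measuring how much two filtered sets overlap), here is a minimal Python sketch. It is not the method of any of the programs named above: the greedy longest-first filter, the use of `difflib.SequenceMatcher` as a rough stand-in for alignment-based percent identity, the Jaccard index as the overlap measure, and the toy sequences `P1` to `P4` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Rough stand-in for percent identity (assumption: NOT an alignment score)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def reduce_redundancy(seqs: dict[str, str], threshold: float) -> set[str]:
    """Greedy filter: visit sequences longest first and keep one only if its
    similarity to every sequence already kept is below `threshold`."""
    kept: list[str] = []
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        if all(similarity(seqs[name], seqs[k]) < threshold for k in kept):
            kept.append(name)
    return set(kept)


def overlap(a: set[str], b: set[str]) -> float:
    """Jaccard index of two sets of sequence identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy sequences, invented for this example.
toy = {
    "P1": "MKTAYIAKQRQISFVKSHFSRQ",
    "P2": "MKTAYIAKQRQISFVKSHFSRA",  # near-duplicate of P1
    "P3": "GSHMLEDPVAGNWTRLLKK",     # unrelated to the others
    "P4": "MKTAYIAKQRQISFAAAAAAAA",  # shares only a prefix with P1
}

strict = reduce_redundancy(toy, threshold=0.5)
loose = reduce_redundancy(toy, threshold=0.9)
print(sorted(strict), sorted(loose), round(overlap(strict, loose), 3))
```

In this toy case the stricter threshold yields a subset of the looser one, which loosely mirrors the high same-program overlap reported in the abstract. Swapping in a different `similarity` function or visiting order is one simple way to see how two tools can disagree on the same input at the same threshold.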
