Article

AncestralClust: clustering of divergent nucleotide sequences by ancestral sequence reconstruction using phylogenetic trees

Journal

BIOINFORMATICS
Volume 38, Issue 3, Pages 663-670

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btab723

Funding

  1. National Institutes of Health [R01GM138634-01]

The study developed a phylogenetic clustering method called AncestralClust for clustering divergent sequences. Comparison with other state-of-the-art clustering methods showed that AncestralClust has higher accuracy and more even cluster sizes in divergent datasets.
Motivation: Clustering is a fundamental task in the analysis of nucleotide sequences. Despite the exponential increase in the size of sequence databases of homologous genes, few methods exist to cluster divergent sequences. Traditional clustering methods have mostly focused on optimizing high speed clustering of highly similar sequences. We develop a phylogenetic clustering method which infers ancestral sequences for a set of initial clusters and then uses a greedy algorithm to cluster sequences.

Results: We describe a clustering program AncestralClust, which is developed for clustering divergent sequences. We compare this method with other state-of-the-art clustering methods using datasets of homologous sequences from different species. We show that, in divergent datasets, AncestralClust has higher accuracy and more even cluster sizes than current popular methods.
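
The two-stage idea in the abstract (derive one representative sequence per initial cluster, then greedily assign the remaining sequences) can be illustrated with a toy sketch. This is not the AncestralClust implementation. It assumes pre-aligned, equal-length sequences, and it substitutes a column-wise majority consensus for true ancestral sequence reconstruction on a phylogenetic tree. The function names, the p-distance measure, and the 0.3 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter


def consensus(seqs):
    """Column-wise majority base. ASSUMPTION: a crude stand-in for
    ancestral sequence reconstruction, which the paper performs on a tree."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def p_distance(a, b):
    """Fraction of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def greedy_assign(seqs, initial_clusters, threshold=0.3):
    """Assign each sequence to the nearest cluster representative.

    ASSUMPTION: `threshold` is a hypothetical cutoff; sequences farther
    than this from every representative are returned as unassigned.
    """
    reps = [consensus(c) for c in initial_clusters]
    clusters = [[] for _ in reps]
    unassigned = []
    for s in seqs:
        dists = [p_distance(s, r) for r in reps]
        best = min(range(len(reps)), key=dists.__getitem__)
        if dists[best] <= threshold:
            clusters[best].append(s)
        else:
            unassigned.append(s)
    return reps, clusters, unassigned


# Toy usage with two initial clusters of aligned 4-mers.
initial = [["AAAA", "AAAT", "AAAA"], ["GGGG", "GGGC", "GGGG"]]
reps, clusters, unassigned = greedy_assign(["AAAT", "GGCG", "ACGT"], initial)
```

In this sketch a sequence that is far from every representative is set aside rather than forced into a cluster. How such leftover sequences are handled in the actual program is not stated in the abstract.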
