A ground truth based comparative study on clustering of gene expression data

FRONTIERS IN BIOSCIENCE-LANDMARK (2008)

Journal

FRONTIERS IN BIOSCIENCE-LANDMARK

Volume 13, Issue -, Pages 3839-3849

Publisher

IMR PRESS

DOI: 10.2741/2972

Keywords

clustering evaluation; sample clustering; comparative study; gene expression data

Abstract

Given the variety of available clustering methods for gene expression data analysis, it is important to develop an appropriate and rigorous validation scheme to assess the performance and limitations of the most widely used clustering algorithms. In this paper, we present a ground truth based comparative study on the functionality, accuracy, and stability of five data clustering methods: hierarchical clustering, K-means clustering, self-organizing maps, standard finite normal mixture fitting, and a caBIG™ toolkit, the VIsual Statistical Data Analyzer (VISDA). The methods were tested on sample clustering of seven published microarray gene expression datasets and one synthetic dataset. We examined the performance of these algorithms in both data-sufficient and data-insufficient cases using quantitative performance measures, including cluster number detection accuracy and the mean and standard deviation of partition accuracy. The experimental results showed that VISDA, an interactive coarse-to-fine maximum likelihood fitting algorithm, is a solid performer on most of the datasets, while K-means clustering and self-organizing maps optimized by the mean squared compactness criterion generally produce more stable solutions than the other methods.
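The abstract names the mean and standard deviation of partition accuracy as performance measures but does not define them. The sketch below illustrates one common reading, under stated assumptions: partition accuracy is taken as the fraction of samples that agree with the ground-truth classes under the best one-to-one mapping of clusters to classes, and stability is summarized by the spread of that value over repeated runs. The function names `partition_accuracy` and `summarize_runs` and the toy labels are hypothetical and are not taken from the paper.

```python
from itertools import permutations
from statistics import mean, pstdev


def partition_accuracy(true_labels, cluster_labels):
    """Fraction of samples matched under the best one-to-one cluster-to-class mapping.

    Assumption: this is one common definition, not necessarily the paper's.
    Brute force over mappings, so it is only practical for a handful of clusters.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")

    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    # If there are more clusters than classes, the extra clusters map to
    # None and all of their samples count as errors.
    targets = classes + [None] * max(0, len(clusters) - len(classes))

    best_hits = 0
    for assignment in permutations(targets, len(clusters)):
        mapping = dict(zip(clusters, assignment))
        hits = sum(mapping[c] == t for t, c in zip(true_labels, cluster_labels))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


def summarize_runs(true_labels, runs):
    """Mean and (population) standard deviation of accuracy over repeated runs."""
    accuracies = [partition_accuracy(true_labels, run) for run in runs]
    return mean(accuracies), pstdev(accuracies)


# Toy ground truth: six samples in two classes (made-up data).
truth = [0, 0, 0, 1, 1, 1]
# Three hypothetical clustering runs, e.g. from different random initializations.
runs = [
    [1, 1, 1, 0, 0, 0],  # same partition, cluster ids swapped
    [0, 0, 1, 1, 1, 1],  # one sample misplaced
    [2, 2, 2, 0, 0, 1],  # one extra cluster detected
]
avg, spread = summarize_runs(truth, runs)
print(f"mean accuracy = {avg:.3f}, std = {spread:.3f}")
```

A lower standard deviation across runs corresponds to what the abstract calls a more stable solution. For more than a few clusters, the brute-force search over mappings would normally be replaced by an assignment solver.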
