
Adaptive Ensembling of Semi-Supervised Clustering Solutions

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2017)

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Volume 29, Issue 8, Pages 1577-1590

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TKDE.2017.2695615

Keywords

Clustering ensemble; semi-supervised clustering; clustering

Funding

NSFC [U1611461, 61572199, 61502174, 61502173]
Guangdong Natural Science Funds for Distinguished Young Scholars [S2013050014677]
Science and Technology Planning Project of Guangdong Province, China [2015A050502011, 2016B090918042, 2016A050503015, 2016B010127003]
Fundamental Research Funds for the Central Universities [D2153950]
Research Grants Council of the Hong Kong Special Administrative Region, China [CityU 11300715, 152202/14E]


Abstract

Conventional semi-supervised clustering approaches have several shortcomings: (1) they do not fully utilize all useful must-link and cannot-link constraints, (2) they do not consider how to handle noisy high-dimensional data, and (3) they do not use an adaptive process to further improve the performance of the algorithm. In this paper, we first propose a transitive closure based constraint propagation approach, which uses the transitive closure operator and affinity propagation to address the first limitation. Then, a random subspace based semi-supervised clustering ensemble framework with a set of proposed confidence factors is designed to address the second limitation and to provide more stable, robust, and accurate results. Next, an adaptive semi-supervised clustering ensemble framework is proposed to address the third limitation; it adopts a newly designed adaptive process to search for the optimal subspace set. Finally, we adopt a set of nonparametric tests to compare different semi-supervised clustering ensemble approaches over multiple datasets. The experimental results on 20 real high-dimensional cancer datasets with noisy genes and 10 datasets from the UCI and KEEL repositories show that (1) the proposed approaches work well on most of the real-world datasets, and (2) they outperform other state-of-the-art approaches on 12 out of 20 cancer datasets and 8 out of 10 UCI machine learning datasets.
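
To make the first idea in the abstract concrete, the sketch below shows one common way to take the transitive closure of must-link and cannot-link constraints. It is an illustration under stated assumptions, not the authors' implementation: the function name `close_constraints`, the union-find approach, and the toy constraint lists are all hypothetical, and the affinity propagation step that the paper combines with the closure is not shown.

```python
from itertools import combinations


def close_constraints(n, must_link, cannot_link):
    """Expand pairwise constraints over points 0..n-1 to their transitive closure.

    Hypothetical helper for illustration only; it is not taken from the paper.
    Returns (must_link_closure, cannot_link_closure) as sets of (i, j) with i < j.
    """
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Must-link is an equivalence relation: merge the linked points.
    for a, b in must_link:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Collect the members of each must-link component.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    # Every pair inside a component is an implied must-link.
    ml = set()
    for members in groups.values():
        ml.update(combinations(members, 2))

    # A cannot-link between two points extends to their whole components.
    cl = set()
    for a, b in cannot_link:
        ra, rb = find(a), find(b)
        if ra == rb:
            raise ValueError(f"constraints conflict on pair ({a}, {b})")
        for x in groups[ra]:
            for y in groups[rb]:
                cl.add((min(x, y), max(x, y)))
    return ml, cl


# Toy example: 6 points, three must-links and one cannot-link.
ml, cl = close_constraints(6, [(0, 1), (1, 2), (3, 4)], [(2, 3)])
print(sorted(ml))  # [(0, 1), (0, 2), (1, 2), (3, 4)]
print(sorted(cl))  # [(0, 3), (0, 4), (1, 3), (1, 4), (2, 3), (2, 4)]
```

In this toy case a single cannot-link between points 2 and 3 expands to six implied cannot-links, which gives a sense of how closure can extract more information from a small set of supplied constraints.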
