Band depth based initialization of K-means for functional data clustering

ADVANCES IN DATA ANALYSIS AND CLASSIFICATION (2023)

Journal

ADVANCES IN DATA ANALYSIS AND CLASSIFICATION

Volume 17, Issue 2, Pages 463-484

Publisher

SPRINGER HEIDELBERG

DOI: 10.1007/s11634-022-00510-w

Keywords

k-Means; Modified Band Depth; B-spline; functional data; bootstrap

Automated Summary

The k-Means algorithm is a popular choice for clustering data, but it is known to be sensitive to the initialization process. This paper introduces an extension of the BRIk algorithm to longitudinal data. BRIk clusters centroids derived from bootstrap replicates of the data and uses the Modified Band Depth. The proposed approach enhances BRIk by fitting B-splines to the observations and adding a resampling process, which makes it more effective at providing initial seeds for k-Means clustering.

Abstract

The k-Means algorithm is one of the most popular choices for clustering data but is well known to be sensitive to the initialization process. A substantial number of methods aim at finding optimal initial seeds for k-Means, though none of them is universally valid. This paper presents an extension to longitudinal data of one such method, the BRIk algorithm, which relies on clustering a set of centroids derived from bootstrap replicates of the data and on the use of the versatile Modified Band Depth. In our approach we improve the BRIk method by adding a step in which we fit appropriate B-splines to our observations and a resampling process that allows computational feasibility and the handling of issues such as noise or missing data. We have derived two techniques for providing suitable initial seeds, stressing respectively the multivariate and the functional nature of the data. Our results with simulated and real data sets indicate that our Functional Data Approach to the BRIk method (FABRIk) and our Functional Data Extension of the BRIk method (FDEBRIk) are more effective than previous proposals at providing seeds to initialize k-Means in terms of clustering recovery.
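
To make the role of the Modified Band Depth more concrete, the following is a minimal sketch of how the depth of each curve in a group could be computed and the deepest curve selected as a seed. It is not the authors' implementation. It assumes curves already evaluated on a shared grid (for example, after a B-spline fit), uses the common choice of bands formed by pairs of curves (J = 2), and the function names `modified_band_depth` and `deepest_curve` are hypothetical.

```python
from itertools import combinations


def modified_band_depth(curves):
    """Return the Modified Band Depth (J = 2) of every curve.

    `curves` is a list of equal-length lists, one per curve, holding the
    curve's values on a shared evaluation grid (an assumption of this sketch).
    """
    n = len(curves)
    if n < 2:
        raise ValueError("need at least two curves to form a band")
    n_points = len(curves[0])
    pairs = list(combinations(range(n), 2))

    depths = []
    for x in curves:
        total = 0.0
        for j, k in pairs:
            # Fraction of grid points where x lies inside the band
            # delimited by curves j and k (boundaries included).
            inside = sum(
                1
                for t in range(n_points)
                if min(curves[j][t], curves[k][t])
                <= x[t]
                <= max(curves[j][t], curves[k][t])
            )
            total += inside / n_points
        depths.append(total / len(pairs))
    return depths


def deepest_curve(curves):
    """Return the curve with the largest depth, a candidate initial seed."""
    depths = modified_band_depth(curves)
    best = max(range(len(curves)), key=depths.__getitem__)
    return curves[best]
```

In a BRIk-style pipeline, `deepest_curve` would be applied to each group of bootstrap centroids, and the resulting curves passed to k-Means as initial seeds. This direct version costs on the order of n² × T operations for n curves on T grid points, which is fine for illustration but slow for large samples.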
