Article

Scalable clustering for EO data using efficient raster representation

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 82, Issue 8, Pages 12303-12319

Publisher

SPRINGER
DOI: 10.1007/s11042-022-13726-x

Keywords

Compact Data Structure; k²-raster; Earth observation data; Scalable clustering

Earth Observation (EO) data is a valuable source of information for many applications. Unsupervised learning techniques such as clustering are commonly used to uncover hidden information in it. However, traditional clustering implementations are limited by system memory when processing large EO datasets. This study presents a technique that clusters raster data stored in a compact data structure, the k²-raster, without decompressing it, which improves processing efficiency.
Earth Observation (EO) data is a source of a wide range of information for vegetation, oceanography, land use, land cover, and many other applications. To uncover the hidden information in the data, unsupervised learning techniques such as clustering are widely used. With technological advancements, the amount of data received from satellites is rising exponentially and has the properties of Big Data. Traditional implementations of clustering algorithms are limited by the memory capacity of the system. In general, applying any clustering algorithm directly to large EO raster data takes a considerable amount of time because of the spatio-temporal nature of the data. The data generated by remote sensors contain many redundant values, so we attempt to cluster compressed raster data without decompressing it. In this work, we present a technique for clustering raster data based on the compact data structure known as the k²-raster. The k²-raster preserves the spatial context of the image and provides time-efficient, lossless compression. We have applied this technique to single and stacked OCM2-NDVI images in GeoTIFF format to build a compressed dataset for efficient representation. Because partition-based clustering algorithms are widely used for clustering EO data owing to their simplicity and time efficiency, we have examined the results with k-means and mini-batch k-means. We have also assessed the performance of our approach with model-based clustering algorithms. The results demonstrate that, for datasets with larger numbers of raster values, the compact-data-structure-based approach works very well both in representing the data compactly and in combination with partition-based and model-based clustering techniques, making the clustering scalable.
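The idea of clustering redundant raster data without expanding it back to individual pixels can be illustrated with a small sketch. This is not the authors' implementation. It is a deliberately simplified, assumed stand-in in which a raster is split recursively into k × k quadrants until each block holds a single value, loosely mirroring the recursive partition behind the k²-raster (the real structure also stores per-node minima and maxima in compact, differentially encoded arrays, which is omitted here). The function names `build_uniform_blocks` and `weighted_kmeans_1d`, the toy raster, and the initial centers are all hypothetical.

```python
def build_uniform_blocks(raster, k=2):
    """Split a square raster (side a power of k) into k*k quadrants recursively.

    Recursion stops at blocks whose cells all share one value.
    Returns a list of (row, col, size, value) tuples.
    Simplification: a real k2-raster keeps min/max per node, not just leaves.
    """
    blocks = []

    def split(row, col, size):
        cells = [raster[i][j]
                 for i in range(row, row + size)
                 for j in range(col, col + size)]
        if min(cells) == max(cells):      # uniform block: store it once
            blocks.append((row, col, size, cells[0]))
            return
        step = size // k
        for dr in range(k):
            for dc in range(k):
                split(row + dr * step, col + dc * step, step)

    split(0, 0, len(raster))
    return blocks


def weighted_kmeans_1d(values, weights, init_centers, max_iter=50):
    """Lloyd's k-means on scalar values, each counted `weight` times."""
    centers = [float(c) for c in init_centers]
    for _ in range(max_iter):
        sums = [0.0] * len(centers)
        totals = [0.0] * len(centers)
        for v, w in zip(values, weights):
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            sums[nearest] += v * w
            totals[nearest] += w
        updated = [sums[i] / totals[i] if totals[i] else centers[i]
                   for i in range(len(centers))]
        if updated == centers:            # converged
            break
        centers = updated
    return centers


# Toy 4x4 single-band raster with large uniform regions (assumed data).
raster = [
    [1, 1, 6, 6],
    [1, 1, 6, 6],
    [1, 1, 9, 8],
    [1, 1, 9, 9],
]

blocks = build_uniform_blocks(raster)

# Cluster the blocks, weighting each by the number of pixels it covers.
centers = weighted_kmeans_1d(
    [value for _, _, _, value in blocks],
    [size * size for _, _, size, _ in blocks],
    init_centers=[1, 9],
)

# Reference run on every pixel individually, for comparison.
pixels = [v for row in raster for v in row]
pixel_centers = weighted_kmeans_1d(pixels, [1] * len(pixels), init_centers=[1, 9])

print(len(blocks), "blocks instead of", len(pixels), "pixels")
print(centers, pixel_centers)
```

In this toy case the block-level run visits 7 weighted items rather than 16 pixels and arrives at the same centers as the pixel-level run, because a uniform block of n pixels contributes to a centroid exactly as n identical pixels would. How well this carries over to real EO imagery depends on how much redundancy the raster actually contains.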
