Article

Comparative Analysis of Traditional and Advanced Clustering Techniques in Bioaerosol Data: Evaluating the Efficacy of K-Means, HCA, and GenieClust with and without Autoencoder Integration

Journal

ATMOSPHERE
Volume 14, Issue 9

Publisher

MDPI
DOI: 10.3390/atmos14091416

Keywords

PBAP; bioaerosol; UVLIF; WIBS; machine learning (ML); cluster analysis; GenieClust; K-means; HCA; real-time detection and analysis; fungal spores; bacteria; pollen; climate change

In a comparative study, the capabilities of K-means, hierarchical clustering algorithm (HCA), and GenieClust were examined. K-means and HCA showed consistent cluster profiles and sizes, while GenieClust effectively differentiated various clusters. The use of an autoencoder (AE) enhanced outlier detection for K-means but may have distorted clustering outcomes for HCA. GenieClust, with or without AE, successfully distinguished distinct clusters with greater variability in compositional loadings, identifying more particle types compared to traditional methods.
In a comparative study of new and traditional clustering techniques, the capabilities of K-means, the hierarchical clustering algorithm (HCA), and GenieClust were examined. Both K-means and HCA demonstrated strong consistency in cluster profiles and sizes, emphasizing their effectiveness in differentiating particle types and confirming that the fundamental patterns within the data were captured reliably. An added dimension to the study was the integration of an autoencoder (AE). When coupled with K-means, the AE enhanced outlier detection, particularly in identifying the compositional loadings of each cluster. Applied to all methods, the AE showed a potential for noise reduction by removing infrequent, larger particles; in the case of HCA, however, information distortion during the encoding process may have affected the clustering outcomes by reducing the number of observably distinct clusters. The findings indicate that GenieClust, applied both with and without an AE, was effective in delineating a notable number of distinct clusters. Furthermore, each cluster's compositional loadings exhibited greater internal variability, distinguishing up to three times more particle types per cluster than the traditional methods, underscoring the algorithm's ability to differentiate subtle data patterns. The work postulates that applying GenieClust both with and without an AE may provide important information through initial outlier detection and, with an AE applied, enriched speciation, as evidenced by a greater number of distinct clusters within the main body of the data.
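To make the idea of "consistency in cluster profiles and sizes" concrete, the following is a minimal sketch of how two traditional methods can be compared on the same data. It is not the study's pipeline: the data are synthetic 2-D points rather than UVLIF/WIBS measurements, the average-linkage rule for HCA and the farthest-point initialisation for K-means are assumptions chosen for illustration, all function names are hypothetical, and GenieClust and the autoencoder step are omitted.

```python
import math
import random


def make_blobs(seed=0):
    """Three well-separated synthetic 2-D groups of unequal size (toy data)."""
    rng = random.Random(seed)
    spec = [((0.0, 0.0), 30), ((5.0, 5.0), 20), ((10.0, 0.0), 10)]
    return [(cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5))
            for (cx, cy), n in spec for _ in range(n)]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def hca_average(points, k):
    """Naive agglomerative clustering with average linkage, cut at k clusters."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(math.dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Find and merge the closest pair of clusters (a < b, so pop(b) is safe).
        a, b = min(((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
                   key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]))
        clusters[a].extend(clusters.pop(b))
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def sizes(labels):
    """Cluster sizes, largest first."""
    return sorted((labels.count(lab) for lab in set(labels)), reverse=True)


def partition(labels):
    """Label-independent view of a clustering: a set of index groups."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(i)
    return {frozenset(g) for g in groups.values()}


points = make_blobs()
km = kmeans(points, 3)
hc = hca_average(points, 3)
print("K-means cluster sizes:", sizes(km))
print("HCA cluster sizes:    ", sizes(hc))
print("Identical partitions: ", partition(km) == partition(hc))
```

On data this cleanly separated, both methods recover the same three groups, which is the toy analogue of the agreement between K-means and HCA reported above. Real bioaerosol data would be higher-dimensional and noisier, and would normally be handled with an established clustering library rather than these hand-written routines.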
