Journal
CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES
Volume 125, Issue 2, Pages 459-494
Publisher
TECH SCIENCE PRESS
DOI: 10.32604/cmes.2020.010791
Keywords
Ant lion optimization; binary clustering; clustering algorithms; Higgs boson; feature extraction; dimensionality reduction; elbow criterion; genetic algorithm; particle swarm optimization
This paper focuses on the unsupervised detection of the Higgs boson particle using the most informative features and variables that characterize the Higgs Machine Learning Challenge 2014 data set. The analysis proceeds in four steps: (1) selection of the most informative features from the data; (2) determination of the number of clusters using the elbow criterion, for which the experimental results showed that the optimal number of clusters for grouping the data in an unsupervised manner is 2; (3) a new approach that hybridizes hard and fuzzy clustering, tuned with Ant Lion Optimization (ALO); (4) comparison with existing metaheuristic optimizers such as Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). Using a multi-angle analysis based on cluster validation indices, the confusion matrix, efficiency and purity rates, average cost variation, computational time, and Sammon mapping visualization, the results highlight the effectiveness of the improved Gustafson-Kessel algorithm optimized with ALO (ALOGK) and validate the proposed approach. Although the paper gives a complete clustering analysis, its novel contributions concern only Steps (1) and (3). The first contribution lies in the method used in Step (1) to select the most informative features and variables. The t-statistic is used to rank them. A feature mapping is then applied using a Self-Organizing Map (SOM) to identify the level of correlation between them. Finally, Particle Swarm Optimization (PSO), a metaheuristic optimization technique, is used to reduce the dimension of the data set.
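As a rough illustration of the ranking part of Step (1), the sketch below scores each feature by the absolute value of a two-sample t-statistic and sorts the features by that score. It is a minimal sketch, not the paper's procedure: the Welch (unequal-variance) form of the statistic, the availability of a two-group label for the ranking, the synthetic data, and the names `t_statistic` and `rank_features` are all assumptions made here for illustration. The SOM mapping and PSO reduction are not shown.

```python
import math
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Welch two-sample t-statistic between two lists of floats (assumed form)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_features(rows, labels):
    """Return (feature indices sorted by decreasing |t|, list of |t| scores)."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        group1 = [row[j] for row, y in zip(rows, labels) if y == 1]
        group0 = [row[j] for row, y in zip(rows, labels) if y == 0]
        scores.append(abs(t_statistic(group1, group0)))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Synthetic stand-in data (hypothetical, not the Higgs challenge set):
# feature 0 carries no group difference, feature 1 a strong one, feature 2 a weak one.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
rows = [
    [rng.gauss(0.0, 1.0), rng.gauss(2.0 * y, 1.0), rng.gauss(0.3 * y, 1.0)]
    for y in labels
]

order, scores = rank_features(rows, labels)
print("features ranked by |t|:", order)
```

On this synthetic data the strongly shifted feature (index 1) should come first in the ranking; the relative order of the other two depends on the random draw.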
The second contribution of this work concerns Step (3), in which each of the clustering algorithms, namely K-means (KM), Global K-means (GlobalKM), Partitioning Around Medoids (PAM), Fuzzy C-means (FCM), Gustafson-Kessel (GK), and Gath-Geva (GG), is optimized and tuned with ALO.
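The elbow criterion of Step (2) can be sketched with plain K-means, the simplest of the algorithms listed above. The example below is an illustration under stated assumptions, not the paper's ALO-tuned method: it runs standard Lloyd iterations with random restarts on synthetic two-blob data, and it picks the elbow as the k with the largest second difference of the within-cluster sum of squares, which is one common heuristic. The names `kmeans_wcss` and `elbow` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wcss(points, k, rng, iters=50):
    """Run Lloyd's K-means once; return the within-cluster sum of squares."""
    centers = rng.sample(points, k)  # initialize from k distinct data points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow(wcss):
    """wcss[i] is the cost for k = i + 1; return the k with the sharpest bend."""
    best_k, best_curvature = None, float("-inf")
    for i in range(1, len(wcss) - 1):
        curvature = wcss[i - 1] - 2 * wcss[i] + wcss[i + 1]
        if curvature > best_curvature:
            best_k, best_curvature = i + 1, curvature
    return best_k


# Synthetic stand-in data (hypothetical): two well-separated 2-D blobs.
rng = random.Random(42)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
points += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(100)]

# Best of 5 restarts for each k from 1 to 6.
wcss = [min(kmeans_wcss(points, k, rng) for _ in range(5)) for k in range(1, 7)]
best_k = elbow(wcss)
print("selected number of clusters:", best_k)
```

With two well-separated groups the cost drops sharply from k = 1 to k = 2 and flattens afterwards, so the heuristic selects 2 clusters, in line with the number reported in the abstract for the Higgs data.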