4.6 Article

Mixture of von Mises-Fisher distribution with sparse prototypes

Journal

NEUROCOMPUTING
Volume 501, Issue -, Pages 41-74

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2022.05.118

Keywords

Clustering; Mixtures; Von Mises-Fisher; Expectation maximization; High dimensional data; Path following strategy; Model selection

The article presents a method for clustering data on the unit hypersphere using mixtures of von Mises-Fisher distributions, a model well suited to high-dimensional directional data. Estimating the mixture with a penalized likelihood yields sparse prototypes, which improves clustering interpretability. The approach is evaluated on simulated data and on real data benchmarks. A new data set on financial reports is also introduced to illustrate the benefits of the method for exploratory analysis.
Mixtures of von Mises-Fisher distributions can be used to cluster data on the unit hypersphere. This is particularly well suited to high-dimensional directional data such as texts. We propose in this article to estimate a von Mises-Fisher mixture using an l1-penalized likelihood. This leads to sparse prototypes that improve clustering interpretability. We introduce an expectation-maximisation (EM) algorithm for this estimation and explore the trade-off between the sparsity term and the likelihood term with a path following algorithm. The model's behaviour is studied on simulated data, and we show the advantages of the approach on real data benchmarks. We also introduce a new data set on financial reports and exhibit the benefits of our method for exploratory analysis. (C) 2022 Elsevier B.V. All rights reserved.
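
To make the idea in the abstract concrete, here is a minimal sketch of one EM iteration for an l1-penalized von Mises-Fisher mixture. It is an illustration under simplifying assumptions, not the authors' algorithm: the concentration `kappa` is assumed known and shared by all components (so the normalising constant cancels in the E-step), and the names `soft_threshold`, `sparse_direction`, `em_step`, `kappa` and `lam` are hypothetical. The mean-direction update uses soft-thresholding followed by renormalisation, which maximises a linear term minus an l1 penalty on the unit sphere.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each coordinate of v towards zero by t (l1 proximal operator)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_direction(a, lam):
    """Maximise a @ mu - lam * ||mu||_1 over unit vectors mu."""
    s = soft_threshold(a, lam)
    norm = np.linalg.norm(s)
    if norm > 0:
        return s / norm
    # Penalty removed every coordinate: keep only the largest one.
    j = np.argmax(np.abs(a))
    e = np.zeros_like(a)
    e[j] = 1.0 if a[j] >= 0 else -1.0
    return e

def em_step(X, mus, alphas, kappa, lam):
    """One EM iteration. X: (n, d) unit rows, mus: (K, d), alphas: (K,)."""
    # E-step: responsibilities, computed stably in log space.
    # With a shared kappa the vMF normalising constant cancels out.
    logits = kappa * X @ mus.T + np.log(alphas)
    logits -= logits.max(axis=1, keepdims=True)
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: mixing proportions and sparse mean directions.
    new_alphas = resp.mean(axis=0)
    R = resp.T @ X  # (K, d) responsibility-weighted sums of observations
    new_mus = np.vstack([sparse_direction(kappa * r, lam) for r in R])
    return new_mus, new_alphas, resp
```

In this sketch, larger values of `lam` zero out more coordinates of each prototype. Re-running the iteration over an increasing grid of `lam` values, warm-started from the previous solution, is one simple way to explore the sparsity/likelihood trade-off that the abstract mentions.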
