☆ 4.6 Article

Mixture of von Mises-Fisher distribution with sparse prototypes

NEUROCOMPUTING (2022)

Journal

NEUROCOMPUTING

Volume 501, Issue -, Pages 41-74

Publisher

ELSEVIER

DOI: 10.1016/j.neucom.2022.05.118

Keywords

Clustering; Mixtures; Von Mises-Fisher; Expectation maximization; High dimensional data; Path following strategy; Model selection

Category

Computer Science, Artificial Intelligence

Summary

The article presents a method for clustering data on the unit hypersphere using mixtures of von Mises-Fisher distributions, which are particularly suitable for high-dimensional directional data. Estimating a sparse von Mises-Fisher mixture with a penalized likelihood improves clustering interpretability. The approach is evaluated on simulated data and on real-data benchmarks, which show its advantages. A new data set of financial reports is also introduced to illustrate the benefits of the method for exploratory analysis.

Abstract

Mixtures of von Mises-Fisher distributions can be used to cluster data on the unit hypersphere. This approach is particularly well suited to high-dimensional directional data such as texts. We propose in this article to estimate a von Mises-Fisher mixture using an l1-penalized likelihood. This leads to sparse prototypes that improve clustering interpretability. We introduce an expectation-maximisation (EM) algorithm for this estimation and explore the trade-off between the sparsity term and the likelihood term with a path-following algorithm. The model's behaviour is studied on simulated data, and we show the advantages of the approach on real-data benchmarks. We also introduce a new data set on financial reports and exhibit the benefits of our method for exploratory analysis. © 2022 Elsevier B.V. All rights reserved.
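To make the idea in the abstract concrete, the following is a minimal sketch of an EM loop for an l1-penalized von Mises-Fisher mixture. It is not the authors' algorithm: it assumes a single fixed concentration `kappa` shared by all components (so the normalising constant cancels in the E-step and no Bessel-function computation is needed), and it assumes the prototype update can be done by soft-thresholding the weighted sum of the data and renormalising to unit length. The names `fit_sparse_movmf`, `sparse_direction`, `sparsity_path`, `lam`, and `mu0` are hypothetical, and the warm-started grid in `sparsity_path` is only a simple stand-in for the path-following strategy mentioned in the abstract.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (elementwise)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_direction(r, t):
    """Unit vector maximising r @ mu - t * ||mu||_1 (soft-threshold, then normalise)."""
    s = soft_threshold(r, t)
    norm = np.linalg.norm(s)
    if norm == 0.0:
        # Penalty dominates everything: keep only the largest coordinate.
        e = np.zeros_like(r, dtype=float)
        j = int(np.argmax(np.abs(r)))
        e[j] = np.sign(r[j]) if r[j] != 0 else 1.0
        return e
    return s / norm

def fit_sparse_movmf(X, k, lam, kappa=10.0, n_iter=50, seed=0, mu0=None):
    """EM sketch. X has unit-norm rows; kappa is fixed and shared (an assumption)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Initial prototypes: k distinct data points, unless a warm start is given.
    mu = X[rng.choice(n, size=k, replace=False)].copy() if mu0 is None else mu0.copy()
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities proportional to pi_j * exp(kappa * mu_j @ x_i).
        logits = np.log(pi) + kappa * (X @ mu.T)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: mixing weights, then sparse unit-norm prototypes.
        pi = np.maximum(resp.mean(axis=0), 1e-12)
        pi /= pi.sum()
        R = resp.T @ X  # (k, d) responsibility-weighted sums of the data
        mu = np.vstack([sparse_direction(R[j], lam / kappa) for j in range(k)])
    return pi, mu, resp

def sparsity_path(X, k, lams, **kwargs):
    """Refit over a grid of penalties, warm-starting from the previous prototypes."""
    mu, path = None, []
    for lam in lams:
        _, mu, _ = fit_sparse_movmf(X, k, lam, mu0=mu, **kwargs)
        path.append((lam, int(np.count_nonzero(mu))))
    return path
```

With `lam = 0` the prototype update reduces to the usual normalised weighted mean direction; increasing `lam` zeroes out more coordinates of each prototype, which is the sparsity that the abstract links to interpretability. In practice `kappa` would be estimated per component rather than fixed.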
