4.5 Article

Broadcast news story segmentation using sticky hierarchical Dirichlet process

Journal

APPLIED INTELLIGENCE
Volume 52, Issue 11, Pages 12788-12800

Publisher

SPRINGER
DOI: 10.1007/s10489-021-03098-4

Keywords

Story segmentation; Non-parametric; HDP prior; SHDP-HMM; Infinite state spaces

This paper proposes SHDP-HMM, a method that automatically infers the number of hidden states from data by placing an HDP prior on the transition matrices. A self-transition bias parameter additionally reduces the transition probabilities among redundant states, so that topic durations are modeled better. Experimental results show that the approach outperforms traditional HMM-based methods.
The hidden Markov model (HMM) is a popular technique for story segmentation, in which the hidden Markov states represent the topics. The number of hidden states has to be set manually, yet this number is often unknown. This paper proposes a nonparametric approach, called SHDP-HMM, to address this problem. By defining an HDP prior distribution on transition matrices over countably infinite state spaces, SHDP-HMM can infer the number of hidden states from the data automatically. In addition, to better model the duration of topics, a self-transition bias parameter is used to reduce the transition probabilities among redundant hidden states. Given a stream of text, a Gibbs sampler labels the hidden states with topic classes, and each position where the topic shifts indicates a story boundary. Experiments show that the proposed SHDP-HMM approach outperforms traditional HMM-based approaches and that the number of hidden states can be inferred automatically from the data.
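
The two ideas in the abstract, a sticky HDP prior on the transition rows and story boundaries read off from topic shifts, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the commonly used weak-limit (finite truncation) approximation of the sticky HDP-HMM prior, and the function names, the truncation level `L`, and the hyperparameters `gamma`, `alpha`, and `kappa` are hypothetical choices made for illustration. The Gibbs sampler that labels the states is not shown.

```python
import numpy as np


def sample_sticky_transitions(L, gamma, alpha, kappa, rng):
    """Draw a transition matrix from a truncated sticky HDP-HMM prior.

    Assumption: weak-limit approximation with at most L states.
    kappa is the self-transition bias; kappa = 0 gives a plain HDP-HMM prior.
    """
    # Global state weights shared by all rows (finite stand-in for the DP).
    beta = rng.dirichlet(np.full(L, gamma / L))
    # Row j is Dirichlet(alpha * beta + kappa * e_j): extra mass on staying put.
    # The tiny constant only keeps every concentration strictly positive.
    conc = alpha * beta[None, :] + kappa * np.eye(L) + 1e-12
    pi = np.vstack([rng.dirichlet(row) for row in conc])
    return beta, pi


def story_boundaries(states):
    """Return the positions at which the topic label changes."""
    states = np.asarray(states)
    return (np.flatnonzero(states[1:] != states[:-1]) + 1).tolist()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta, pi = sample_sticky_transitions(L=10, gamma=5.0, alpha=1.0, kappa=50.0, rng=rng)
    print("mean self-transition probability:", np.diag(pi).mean())
    # Hypothetical state labels for eight consecutive text units.
    print(story_boundaries([0, 0, 0, 2, 2, 1, 1, 1]))  # [3, 5]
```

A large `kappa` concentrates each row on its diagonal entry, so a sampled state sequence tends to stay in one topic for a while instead of switching rapidly between near-duplicate states. This is the duration effect the abstract attributes to the self-transition bias.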
