Article

Automatic trend detection: Time-biased document clustering

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 220, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.106907

Keywords

Text mining; Trend detection; Temporal biased clustering; Machine learning


The study demonstrates the importance of emphasizing time in trend detection by introducing a weighted temporal feature. By analyzing finance journal abstracts, trending finance topics that are not identifiable with standard clustering methods are discovered. The use of silhouette score divided by standard deviation to identify and validate trending topics showcases the effectiveness of the approach.
Identifying the trending topics in journals and conferences is valuable for understanding the role of authors, institutions, and funding agencies in the progression of knowledge produced in the field. However, many available clustering methods do not produce the temporally clustered results that are typical of trends, in part because time of publication is often neglected as a feature. As a demonstration of how time can be emphasized in trend detection, we use a novel approach of introducing a weighted temporal feature to bias a topic clustering toward articles in a similar time frame; this is performed over a set of finance journal abstracts from 1974 to 2020. Latent Dirichlet Allocation (LDA) is used to parameterize each abstract, followed by dimensionality reduction using Singular Value Decomposition (SVD). We detect trending finance topics that are not identifiable when we use a standard clustering approach with no temporal bias. To identify trending topics, we utilize a metric of the silhouette score divided by the standard deviation of clusters over time. We then isolate topics identified by this metric and validate them using expert judgment. Our clustering strategy using temporal bias can be readily utilized in other fields for discovering the rise and fall of trends. (c) 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
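
The two ideas in the abstract that are easiest to misread are the weighted temporal feature and the trend metric, so a minimal sketch may help. It is not the authors' implementation. It assumes each abstract has already been reduced to a short topic vector (a stand-in for the LDA and SVD output), that cluster labels are already available, that the year is min-max scaled before weighting, and that the metric is computed per cluster as the mean silhouette divided by the population standard deviation of publication years. The function names, the weight of 2.0, and the toy data are all hypothetical.

```python
import math
from statistics import mean, pstdev


def add_time_feature(vectors, years, weight):
    """Append a min-max scaled publication year, multiplied by `weight`.

    A larger weight pulls articles from the same period closer together.
    (The scaling choice is an assumption of this sketch.)
    """
    lo, hi = min(years), max(years)
    span = (hi - lo) or 1  # avoid division by zero when all years match
    return [list(v) + [weight * (y - lo) / span] for v, y in zip(vectors, years)]


def silhouette_samples(points, labels):
    """Per-point silhouette values with Euclidean distance (needs >= 2 clusters)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton cluster: silhouette is taken as 0
            scores.append(0.0)
            continue
        a = mean([math.dist(points[i], points[j]) for j in own])
        b = min(
            mean([math.dist(points[i], points[j]) for j in idx])
            for other, idx in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


def trend_scores(points, years, labels):
    """Mean cluster silhouette divided by the std of the cluster's years."""
    sil = silhouette_samples(points, labels)
    out = {}
    for lab in set(labels):
        idx = [i for i, l in enumerate(labels) if l == lab]
        spread = pstdev([years[i] for i in idx])
        cohesion = mean([sil[i] for i in idx])
        out[lab] = cohesion / spread if spread else float("inf")
    return out


# Toy data: hypothetical 2-D topic vectors, publication years, cluster labels.
vectors = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
           [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]
years = [2008, 2009, 2010, 1975, 1995, 2015]
labels = [0, 0, 0, 1, 1, 1]

biased = add_time_feature(vectors, years, weight=2.0)
scores = trend_scores(biased, years, labels)
print(scores)
```

In this toy data, cluster 0 is concentrated in 2008 to 2010 while cluster 1 spans four decades, so cluster 0 receives the higher score and would be the candidate trend under these assumptions.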

