Journal
INFORMATION SCIENCES
Volume 607, Pages 79-91
Publisher
ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2022.05.098
Keywords
Neural topic model; Short texts; Context reinforcement
Funding
- National Natural Science Foundation of China [61972426]
- Guangdong Basic and Applied Basic Research Foundation [2020A1515010536]
- Research Grants Council of Hong Kong Special Administrative Region, China [UGC/FDS16/E01/19]
- Research Grants Council of the Hong Kong Special Administrative Region, China
- Direct Grant [DR22A2]
- Faculty Research Grants of Lingnan University, Hong Kong [DB22B4, DB22B7]
This article introduces a Context Reinforced Neural Topic Model (CRNTM) to address the issue of feature sparsity in short texts. The proposed model infers topics for each word in a narrow range and utilizes pre-trained word embeddings for topic modeling. Extensive experiments validate the effectiveness of this model in topic discovery and text classification.
As one of the prevalent topic mining methods, neural topic modeling has attracted considerable interest owing to its low training cost and strong generalisation ability. However, existing neural topic models may suffer from feature sparsity when applied to short texts, because each message provides little context. To alleviate this issue, we propose a Context Reinforced Neural Topic Model (CRNTM), whose characteristics can be summarized as follows. First, by assuming that each short text covers only a few salient topics, the proposed CRNTM infers the topic for each word in a narrow range. Second, our model exploits pre-trained word embeddings by treating topics as multivariate Gaussian distributions or Gaussian mixture distributions in the embedding space. Extensive experiments on two benchmark short-text corpora validate the effectiveness of the proposed model on both topic discovery and text classification. (c) 2022 Elsevier Inc. All rights reserved.
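The two ideas in the abstract can be illustrated with a small sketch. It is not the CRNTM implementation, only a toy under stated assumptions: topics are diagonal-covariance Gaussians over made-up 2-d "embeddings", p(word | topic) is obtained by normalising the Gaussian densities over the vocabulary, and the "narrow range" is approximated by keeping a document's k most probable topics. All names here (`topic_word_distribution`, `narrow_topics`, the toy vectors and parameters) are hypothetical and do not come from the paper.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def softmax(scores):
    """Numerically stable softmax over a list of log-scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topic_word_distribution(embeddings, mean, var):
    """Normalise Gaussian densities over the vocabulary into p(word | topic)."""
    words = list(embeddings)
    log_dens = [log_gaussian_diag(embeddings[w], mean, var) for w in words]
    return dict(zip(words, softmax(log_dens)))


def narrow_topics(doc_topic_probs, k):
    """Keep the k most probable topics of a document and renormalise them."""
    ranked = sorted(
        range(len(doc_topic_probs)),
        key=lambda t: doc_topic_probs[t],
        reverse=True,
    )[:k]
    mass = sum(doc_topic_probs[t] for t in ranked)
    return {t: doc_topic_probs[t] / mass for t in ranked}


# Toy stand-ins for pre-trained word embeddings (values invented for illustration).
embeddings = {
    "goal": [1.0, 0.9],
    "match": [0.9, 1.1],
    "stock": [-1.0, -0.8],
    "market": [-1.1, -1.0],
}

# Three hypothetical topics, each a Gaussian in the embedding space.
topics = [
    {"mean": [1.0, 1.0], "var": [0.5, 0.5]},
    {"mean": [-1.0, -1.0], "var": [0.5, 0.5]},
    {"mean": [1.0, -1.0], "var": [0.5, 0.5]},
]

topic_word = [
    topic_word_distribution(embeddings, t["mean"], t["var"]) for t in topics
]

# A short text is assumed to cover only a few topics: keep the top 2 of 3.
doc_topics = narrow_topics([0.7, 0.25, 0.05], k=2)

for idx, dist in enumerate(topic_word):
    print(idx, max(dist, key=dist.get))
print(doc_topics)
```

Words whose embeddings lie close to a topic's mean receive most of that topic's probability mass, so semantically related words share a topic even when they rarely co-occur in short texts. A Gaussian mixture variant would replace the single density per topic with a weighted sum of several components.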