Proceedings Paper

A Short Text Topic Model Based on Semantics and Word Expansion

Publisher

IEEE
DOI: 10.1109/CCAI55564.2022.9807822

Keywords

Topic analysis; BTM; semantic analysis; short text

Funding

  1. National Natural Science Foundation of China [62176033, 61936001]
  2. Key Cooperation Project of Chongqing Municipal Education Commission [HZ2021008]
  3. Natural Science Foundation of Chongqing [cstc2019jcyj-cxttX0002]

This paper focuses on the sparsity problem of short text datasets and proposes a biterm acquisition method based on semantic dependencies to enhance the semantic relevance between words. It also proposes expanding the number of biterms through word similarity calculation and word relationship calculation to further address text sparsity and enhance the topic tendency of the text.
In recent years, the volume of short text has grown rapidly, and so has research on it; topic analysis of short texts is one of the key research areas. To overcome the sparsity problem of short text datasets, this paper builds on the Biterm Topic Model (BTM), a topic model for short texts. To address the lack of semantic association in BTM, this paper proposes a biterm acquisition method based on semantic dependencies. The method first applies semantic analysis to the text and then combines strongly correlated words into biterms, which enhances the semantic relevance between the words in each biterm. To further mitigate text sparsity, this paper proposes expanding the number of biterms based on the similarity of words and the relationships between words. This method not only solves the sparsity problem but also enhances the topic tendency of the text.
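
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which dependency relations count as "strong correlation" or which similarity measure is used, so this sketch assumes that any dependency edge qualifies and that similarity is cosine similarity over word vectors. The function names (`window_biterms`, `dependency_biterms`, `expand_biterms`), the `threshold` parameter, the hand-written dependency edges, and the toy vectors are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def window_biterms(tokens):
    """Baseline BTM extraction: every unordered pair of distinct words in a short text."""
    return {tuple(sorted(pair)) for pair in combinations(set(tokens), 2)}


def dependency_biterms(edges):
    """Assumed variant: keep only pairs joined by a dependency edge (head, dependent)."""
    return {tuple(sorted((head, dep))) for head, dep in edges if head != dep}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_biterms(biterms, vectors, threshold=0.8):
    """Assumed expansion rule: for a biterm (a, b), add (a, s) for every
    vocabulary word s whose vector is similar to b, and symmetrically."""
    expanded = set(biterms)
    for w1, w2 in biterms:
        for anchor, other in ((w1, w2), (w2, w1)):
            if other not in vectors:
                continue  # no vector for this word, nothing to compare against
            for candidate, vec in vectors.items():
                if candidate in (w1, w2):
                    continue
                if cosine(vectors[other], vec) >= threshold:
                    expanded.add(tuple(sorted((anchor, candidate))))
    return expanded


# Toy input: in practice the edges would come from a dependency parser
# and the vectors from pretrained word embeddings (both assumptions here).
tokens = ["apple", "releases", "new", "phone"]
edges = [("releases", "apple"), ("releases", "phone"), ("phone", "new")]
vectors = {"phone": (1.0, 0.0), "smartphone": (0.9, 0.1), "banana": (0.0, 1.0)}

baseline = window_biterms(tokens)
semantic = dependency_biterms(edges)
expanded = expand_biterms(semantic, vectors)
```

On this toy input the baseline pairs every word with every other word, the dependency-based variant keeps only the three syntactically linked pairs, and the expansion step adds pairs that substitute "smartphone" for "phone". The expansion counteracts the sparsity that the stricter filtering would otherwise make worse, which is the trade-off the abstract describes.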
