Article

Evaluation of the Optimal Topic Classification for Social Media Data Combined with Text Semantics: A Case Study of Public Opinion Analysis Related to COVID-19 with Microblogs

Journal

Publisher

MDPI
DOI: 10.3390/ijgi10120811

Keywords

LDA; topic model; BERT; topic classification; public opinion analysis

Researchers proposed an evaluation method that combines a pre-trained language model with a topic model to determine the optimal number of topics based on semantic similarity. In an empirical study of COVID-19, the method produced five categories of public opinion topics whose spatial and temporal distribution patterns were consistent with the development of the epidemic.
Online public opinion reflects social conditions and public attitudes regarding special social events. Analyzing the temporal and spatial distributions of online public opinion topics can therefore help in understanding issues of public concern and in tracking and guiding how public opinion develops. However, evaluating the validity of a classification of online public opinion remains a challenging task in the topic mining field. By combining a Bidirectional Encoder Representations from Transformers (BERT) pre-trained model with the Latent Dirichlet Allocation (LDA) topic model, we propose an evaluation method that determines the optimal number of topics from the perspective of semantic similarity. The effectiveness of the proposed method was verified on the standard Chinese corpus THUCNews. Taking Coronavirus Disease 2019 (COVID-19)-related geotagged Weibo posts from Wuhan as an example, we used the proposed method to generate five categories of public opinion topics. Combining spatial and temporal information with the classification results, we analyzed the spatial and temporal distribution patterns of the five topics, which were found to be consistent with the development of the epidemic, demonstrating the feasibility of the method in practical cases.
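
The idea of choosing the number of topics by semantic similarity can be illustrated with a minimal sketch. It is not the authors' metric, which the abstract does not specify. It assumes a simple score (mean similarity of each topic's words to the topic centroid, minus mean similarity between topic centroids) and uses hand-made 3-dimensional vectors in place of BERT embeddings and hand-written word lists in place of LDA output. The names `topic_score`, `best_topic_number`, `embed`, and `candidates` are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def topic_score(topics, embed):
    """Assumed score: within-topic cohesion minus between-topic similarity."""
    vecs = [[embed[w] for w in words] for words in topics]
    cents = [centroid(v) for v in vecs]
    within = [cosine(x, c) for v, c in zip(vecs, cents) for x in v]
    between = [cosine(a, b) for a, b in combinations(cents, 2)]
    cohesion = sum(within) / len(within)
    separation = sum(between) / len(between) if between else 0.0
    return cohesion - separation


def best_topic_number(candidates, embed):
    """Return the candidate topic count with the highest score, plus all scores."""
    scores = {k: topic_score(t, embed) for k, t in candidates.items()}
    return max(scores, key=scores.get), scores


# Toy stand-ins for BERT word vectors (assumption: real vectors would be much larger).
embed = {
    "virus": [1.0, 0.1, 0.0], "hospital": [0.9, 0.0, 0.1],
    "bus": [0.1, 1.0, 0.0], "subway": [0.0, 0.9, 0.1],
    "rice": [0.0, 0.1, 1.0], "noodles": [0.1, 0.0, 0.9],
}

# Toy stand-ins for the top words of LDA models fitted with K = 2, 3, 4 topics.
candidates = {
    2: [["virus", "hospital", "bus"], ["subway", "rice", "noodles"]],
    3: [["virus", "hospital"], ["bus", "subway"], ["rice", "noodles"]],
    4: [["virus"], ["hospital"], ["bus", "subway"], ["rice", "noodles"]],
}

best_k, scores = best_topic_number(candidates, embed)
print(best_k, scores)
```

On this toy data the three-topic candidate scores highest, because it keeps related words together without splitting a coherent group. In a real setting the vectors would come from a BERT model and the word lists from LDA runs with different topic counts; a score of this kind also favors very small topics, so it should only compare candidates with at least two topics of reasonable size.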
