Article

A Semantic Embedding Enhanced Topic Model For User-Generated Textual Content Modeling In Social Ecosystems

Journal

COMPUTER JOURNAL
Volume 65, Issue 11, Pages 2953-2968

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/comjnl/bxac091

Keywords

Social Ecosystems; User-generated Textual Content; Topic Model; Semantic Embedding; Twitter; Weibo

Funding

  1. National Natural Science Foundation of China (NSFC) [61932007, 61902075]


The development of ICT and Web 2.0 has led to the emergence of diverse social ecosystems. User-generated textual content is the most important type of content in these ecosystems, but current modeling methods have limitations. Therefore, we propose a new model that can accurately model user-generated textual content in social ecosystems.
The development of Information and Communication Technologies (ICT) and Web 2.0 has promoted the emergence of diverse social ecosystems such as the social Internet of Things (IoT), social media and online communities. User-generated textual content (UGTC), which consists of unstructured text, is the most important and most common type of user-generated content in social ecosystems. UGTC in social ecosystems is generated according to two types of context information: global context (topics) and local context (semantic regularities). For UGTC modeling, topic models consider only global context and ignore semantic regularities, while semantic embedding models do the opposite. Using only a topic model or only a semantic embedding model to model UGTC therefore captures just one of the two contexts. To address this problem, we propose a semantic embedding enhanced topic model named SEE-Twitter-LDA for accurately modeling UGTC in social ecosystems. The core of SEE-Twitter-LDA is that words are generated according to the mutual semantic information of topics and semantic regularities, so global context and local context are jointly considered for UGTC modeling. Using 553 098 tweets sampled from Twitter and 211 233 posts sampled from Weibo, we validate that SEE-Twitter-LDA outperforms existing related models on perplexity, topic divergence and topic coherence.
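As an illustration of the first evaluation metric named in the abstract, the following is a minimal sketch of how perplexity is commonly computed for a topic model. It is not taken from the paper: the function name `perplexity`, the arguments `docs`, `theta` and `phi`, and the assumption that each token's probability is a mixture over per-document topic weights are all hypothetical choices made for this example. The evaluation protocol actually used for SEE-Twitter-LDA is not described in the abstract.

```python
import math


def perplexity(docs, theta, phi):
    """Held-out perplexity under a generic mixture-of-topics model.

    Assumed (hypothetical) inputs, not the paper's notation:
      docs  : list of documents, each a list of integer word ids
      theta : theta[d][k] = weight of topic k in document d
      phi   : phi[k][w]   = probability of word w under topic k
    """
    log_likelihood = 0.0
    n_tokens = 0
    n_topics = len(phi)
    for d, doc in enumerate(docs):
        for w in doc:
            # Probability of the token, marginalised over topics.
            p = sum(theta[d][k] * phi[k][w] for k in range(n_topics))
            log_likelihood += math.log(p)
            n_tokens += 1
    # Perplexity is the exponential of the negative mean log-likelihood.
    return math.exp(-log_likelihood / n_tokens)
```

Lower perplexity means the model assigns higher probability to the held-out text. As a sanity check, a model whose topics are all uniform over a vocabulary of V words has perplexity V.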

