Article

Syndromic surveillance of Flu on Twitter using weakly supervised temporal topic models

Journal

DATA MINING AND KNOWLEDGE DISCOVERY
Volume 30, Issue 3, Pages 681-710

Publisher

SPRINGER
DOI: 10.1007/s10618-015-0434-x

Keywords

Syndromic surveillance; Social media; Topic model; Hidden Markov model

Funding

  1. National Science Foundation [IIS-1353346]
  2. Maryland Procurement Office [H98230-14-C-0127]
  3. Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center (DoI/NBC) [D12PC000337]
  4. VT College of Engineering

Surveillance of epidemic outbreaks and their spread using social media is an important tool for governments and public health authorities. Machine learning techniques for nowcasting the Flu have made significant inroads into correlating social media trends with case counts and the prevalence of epidemics in a population. However, there is a disconnect between data-driven methods for forecasting Flu incidence and epidemiological models that adopt a state-based understanding of transitions, and this disconnect can lead to sub-optimal predictions. Furthermore, models of epidemiological activity and models of social activity, such as activity on Twitter, predict different shapes and differ in important ways. In this paper, we propose two temporal topic models (one unsupervised and one improved, weakly supervised) that capture the hidden states of a user from their tweets and aggregate these states over a geographical region for better estimation of trends. We show that our approaches help fill the gap between phenomenological methods for disease surveillance and epidemiological models. We validate our approaches by modeling the Flu using Twitter data from multiple countries in South America. We demonstrate that our models consistently outperform plain vocabulary assessment in Flu case-count predictions while also yielding better Flu-peak predictions than competing methods. We also show that our fine-grained modeling can reconcile some contrasting behaviors between epidemiological and social models.
