4.7 Article

Twitter conversations predict the daily confirmed COVID-19 cases

Journal

APPLIED SOFT COMPUTING
Volume 129, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.asoc.2022.109603

Keywords

Pandemic forecast; Time series analysis; Social media analytics; Twitter analytics; Granger causality; ARIMAX models; VAR models

Funding

  1. University of Melbourne, Australia


COVID-19 has spread to over 220 countries and territories. The seriousness of the pandemic has increased activity on social media platforms, particularly on Twitter and Weibo. This study aims to incorporate public discourse into forecasting models for ongoing pandemic waves. The results show the presence of social media variables that Granger-cause daily COVID-19 cases and that these variables improve forecasting models.
At the time of writing, COVID-19 (coronavirus disease 2019) has spread to more than 220 countries and territories. Since the outbreak, the seriousness of the pandemic has made people more active on social media, especially on microblogging platforms such as Twitter and Weibo, where pandemic-specific discourse has been trending for months. Previous studies have confirmed that such socially generated conversations contribute to situational awareness of crisis events. Early forecasts of case numbers are essential for authorities to estimate the resources needed to cope with the spread of the virus. This study therefore attempts to incorporate public discourse into the design of forecasting models, particularly for the steep-hill region of an ongoing wave. We propose a sentiment-involved, topic-based latent variable search methodology for designing forecasting models from publicly available Twitter conversations. As a use case, we apply the proposed methodology to daily COVID-19 cases in Australia and to Twitter conversations generated within the country. Experimental results (i) show the presence of latent social media variables that Granger-cause the daily confirmed COVID-19 cases, and (ii) confirm that those variables offer additional predictive capability to forecasting models. Further, the results show that including social media variables yields 48.83%-51.38% improvements in RMSE over the baseline models. We also release MegaGeoCOV, a large-scale COVID-19-specific geotagged global tweets dataset, to the public, anticipating that geotagged data of this scale will aid in understanding the conversational dynamics of the pandemic across other spatial and temporal contexts. (c) 2022 Elsevier B.V. All rights reserved.
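The two quantitative ideas in the abstract, a Granger-causality check and a relative RMSE improvement over a baseline, can be illustrated with a minimal, dependency-free sketch. This is not the authors' pipeline: the synthetic series, the single-lag setup, and every function name here (`solve_linear`, `ols_rss`, `granger_f_stat`, `rmse`, `rmse_improvement_pct`) are assumptions made for illustration. The test compares an autoregression of the target on its own lags (restricted) against one that also includes lags of the candidate variable (unrestricted), using a standard F statistic.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_rss(rows, targets):
    """Fit least squares via the normal equations; return the residual sum of squares."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f_stat(y, x, lag=1):
    """F statistic for H0: lags of x add nothing to an autoregression of y."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    restricted = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in times]
    unrestricted = [row + [x[t - k] for k in range(1, lag + 1)]
                    for row, t in zip(restricted, times)]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - len(unrestricted[0])
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def rmse_improvement_pct(baseline_rmse, model_rmse):
    """Relative RMSE reduction of a model over a baseline, in percent."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Synthetic data (an assumption, not the paper's data): x stands in for a
# social media variable, y for daily cases that depend on yesterday's x.
rng = random.Random(0)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0, 0.1))

f_forward = granger_f_stat(y, x)  # does lagged x help predict y?
f_reverse = granger_f_stat(x, y)  # does lagged y help predict x?
print("F (x -> y):", f_forward)
print("F (y -> x):", f_reverse)
print("RMSE improvement (%):", rmse_improvement_pct(2.0, 1.0))
```

A large F statistic in one direction and a small one in the other is the pattern a Granger test looks for; a real analysis would also choose the lag order, check stationarity, and convert F to a p-value, typically with an established statistics library rather than hand-rolled least squares.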
