Deep learning for real-time social media text classification for situation awareness - using Hurricanes Sandy, Harvey, and Irma as case studies

Journal

INTERNATIONAL JOURNAL OF DIGITAL EARTH
Volume 12, Issue 11, Pages 1230-1247

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/17538947.2019.1574316

Keywords

Text mining; deep learning; hurricanes; Twitter; convolutional neural network; situational awareness

Funding

  1. National Science Foundation [IIP-1338925]

Social media platforms have contributed to disaster management over the past several years. Text mining solutions based on traditional machine learning techniques have been developed to categorize messages into themes, such as caution and advice, in order to better understand their meaning and extract useful information from social media text. However, these methods are mostly event-specific and difficult to generalize to cross-event classification. In other words, traditional classification models trained on historical datasets are not capable of categorizing social media messages from a future event. This research examines the capability of a convolutional neural network (CNN) model for cross-event Twitter topic classification based on three geo-tagged Twitter datasets collected during Hurricanes Sandy, Harvey, and Irma. The performance of the CNN model is compared with that of two traditional machine learning methods: support vector machine (SVM) and logistic regression (LR). Experimental results showed that the CNN models achieved consistently better accuracy in both single-event and cross-event evaluation scenarios, whereas the SVM and LR models had lower cross-event accuracy than their own single-event accuracy. This indicates that a CNN model pre-trained on Twitter data from past events can classify messages from an upcoming event for situational awareness.
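
The cross-event evaluation scenario described in the abstract can be illustrated with a minimal sketch: train on messages from past events, then measure accuracy on an event the model has never seen. Everything in the block is an assumption made for illustration. A tiny bag-of-words Naive Bayes classifier stands in for the paper's CNN, SVM, and LR models, the function names (`train_nb`, `predict_nb`, `cross_event_accuracy`) are hypothetical, and the example messages and labels are invented placeholders rather than data from the study.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token frequencies
    label_counts = Counter()            # label -> number of messages
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict_nb(model, text):
    """Return the most probable label for one message."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # skip words never seen during training
                # Laplace-smoothed log likelihood
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    """Fraction of (text, label) pairs the model labels correctly."""
    correct = sum(predict_nb(model, text) == label for text, label in samples)
    return correct / len(samples)


def cross_event_accuracy(events, target):
    """Train on every event except `target`, then test on `target`."""
    train = [s for name, data in events.items() if name != target for s in data]
    return accuracy(train_nb(train), events[target])


# Invented placeholder messages; NOT data from the study.
events = {
    "sandy": [("stay indoors tonight", "caution_advice"),
              ("power out downtown", "damage")],
    "harvey": [("please stay indoors", "caution_advice"),
               ("power out on my street", "damage")],
    "irma": [("stay indoors everyone", "caution_advice"),
             ("power out again", "damage")],
}

# Cross-event scenario: train on Sandy and Harvey, test on Irma.
print(cross_event_accuracy(events, "irma"))
```

A single-event scenario would instead split one event's messages into training and test portions. Comparing the two accuracy figures for the same model is what exposes how well it generalizes to a future event.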
