Machine learning and Natural Language Processing of social media data for event detection in smart cities

SUSTAINABLE CITIES AND SOCIETY (2022)

Journal

SUSTAINABLE CITIES AND SOCIETY

Volume 85

Publisher

ELSEVIER

DOI: 10.1016/j.scs.2022.104026

Keywords

Social media; Smart cities; Event detection; Natural Language Processing; Citizen satisfaction; Machine learning

Categories

Construction & Building Technology; Green & Sustainable Science & Technology; Energy & Fuels

Funding

EPSRC SemanticLCA Project [EP/T019514/1]

Summary

This article highlights the importance of social media data analysis in decision making within the context of a smart city. It presents the use of Natural Language Processing (NLP) techniques for real-time automated event detection on Twitter. Through semantic-based taxonomy and multiple regression analysis, it reveals the relationships between events and citizen satisfaction, providing valuable data for informed decision making.
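
The event detection step can be illustrated with a small supervised text classifier. The sketch below is not the authors' model: it swaps in a multinomial Naive Bayes classifier written with the Python standard library, and the training tweets, the three event categories, and the sentiment word lists are all hypothetical values invented for illustration. It only shows the general shape of the task: each incoming message is assigned one event category and a satisfaction score.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labelled tweets; the categories are illustrative only and are
# NOT the taxonomy used in the paper.
TRAIN = [
    ("heavy traffic jam on the main bridge this morning", "traffic"),
    ("accident blocks two lanes near the station", "traffic"),
    ("road closed after a crash, long delays", "traffic"),
    ("river levels rising fast, flood warning issued", "flood"),
    ("streets under water after heavy rain", "flood"),
    ("flood defences holding near the park", "flood"),
    ("power cut across the east side tonight", "outage"),
    ("no electricity since noon, blackout again", "outage"),
    ("engineers restoring power after the outage", "outage"),
]

# Hypothetical sentiment word lists used as a stand-in satisfaction measure.
POSITIVE = {"holding", "restoring", "restored", "great", "thanks"}
NEGATIVE = {"jam", "accident", "delays", "crash", "warning",
            "blackout", "cut", "closed"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def satisfaction(text):
    """Score in [-1, 1]: (positive - negative) / (positive + negative)."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos + neg == 0:
        return 0.0
    return (pos - neg) / (pos + neg)


def process_stream(tweets, model):
    """Yield one record per tweet: text, event category, satisfaction."""
    for tweet in tweets:
        yield {
            "text": tweet,
            "event": classify(tweet, model),
            "satisfaction": satisfaction(tweet),
        }


MODEL = train(TRAIN)

if __name__ == "__main__":
    incoming = ["traffic jam and delays on the bridge",
                "flood defences holding"]
    for record in process_stream(incoming, MODEL):
        print(record)
```

A generator is used for `process_stream` so that messages can be handled one at a time as they arrive, which mirrors the streaming setting; a production system would replace the toy classifier and word lists with trained language models.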

Abstract

Social media data analysis in a smart city context can be an effective instrument to inform decision making. The manuscript applies Natural Language Processing (NLP) techniques with supervised learning to Twitter messages to achieve real-time automated event detection in smart cities. A semantic-based taxonomy of risks is devised to discover and analyse associated events from data streams, with a view to: (i) reading and processing published texts in real time; (ii) classifying each text into one representative real-world category; and (iii) assigning a citizen satisfaction value to each event. To select the language processing models striking the best balance between accuracy and processing speed, we conducted a preliminary evaluation comparing several baseline language models previously employed by researchers for event classification. A heuristic analysis of several smart city and community initiatives was conducted to define real-world scenarios as a basis for determining correlations between two or more co-occurring event types and their associated levels of citizen satisfaction, while also considering environmental factors. Based on Multiple Regression Analysis (MRA), we established the relationships between scenario variables, with the independent variables explaining 60%-90% of the variance in the dependent variable. The selected combination of supervised NLP techniques achieves an accuracy of 88.5%. We found that all regression models had at least one variable below the 0.05 threshold of the F-test, and therefore at least one statistically significant independent variable. These findings ultimately illustrate how citizens, acting as active social sensors, can yield vital data that authorities can use to make informed decisions and sustainably build smarter cities.
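
To make the regression figures concrete, the following minimal sketch fits a multiple regression by ordinary least squares and reports the coefficient of determination (R², the share of variance explained) and the overall F-statistic. It uses only the Python standard library. The data are hypothetical: the two predictors (counts of two co-occurring event types) and the satisfaction scores are invented for illustration and are not taken from the paper, and the function names `solve` and `fit_mra` are assumptions of this sketch.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mra(X, y):
    """Ordinary least squares with an intercept.

    Returns (beta, r2, f_stat), where beta[0] is the intercept,
    r2 is the fraction of variance in y explained by the model, and
    f_stat is the overall F-statistic with (k, n - k - 1) degrees
    of freedom.
    """
    n, k = len(X), len(X[0])
    design = [[1.0] + list(row) for row in X]
    # Normal equations: (D^T D) beta = D^T y
    dtd = [[sum(design[r][i] * design[r][j] for r in range(n))
            for j in range(k + 1)] for i in range(k + 1)]
    dty = [sum(design[r][i] * y[r] for r in range(n))
           for i in range(k + 1)]
    beta = solve(dtd, dty)

    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    r2 = 1 - ss_res / ss_tot
    f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
    return beta, r2, f_stat


# Hypothetical scenario data: each row is (count of event type A,
# count of event type B); y is an invented citizen satisfaction score.
X = [(1, 0), (3, 0), (1, 2), (3, 2), (1, 0), (3, 0), (1, 2), (3, 2)]
y = [9, 3, 5, 3, 9, 3, 5, 3]

beta, r2, f_stat = fit_mra(X, y)

if __name__ == "__main__":
    print("coefficients:", [round(b, 3) for b in beta])
    print("R^2:", round(r2, 3))
    print("F-statistic:", round(f_stat, 3))
```

On this toy data the model explains about 83% of the variance in the satisfaction score. The F-statistic would then be compared with the F distribution at 2 and 5 degrees of freedom to obtain a p-value; that step needs a statistics library and is omitted here.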
