Article

Using data mining techniques to explore security issues in smart living environments in Twitter

Journal

COMPUTER COMMUNICATIONS
Volume 179, Pages 285-295

Publisher

ELSEVIER
DOI: 10.1016/j.comcom.2021.08.021

Keywords

Home assistant; IoT; Sentiment analysis; Data mining; Twitter; UGC

Funding

  1. Ministry of Science, Innovation and Universities, Spain
  2. European Regional Development Fund [RTI2018-096295-B-C22]

The study aimed to explore the main security issues in smart living environments by conducting sentiment analysis and topic classification on tweets, followed by extracting insights and statistical information. It was found that the main security issues include malware, cybersecurity attacks, data storing vulnerabilities, the use of testing software in IoT, and possible leaks due to the lack of user experience.
Today, consumers' homes contain millions of Internet-connected devices that jointly make up the Internet of Things (IoT). The development of the IoT industry has led to the emergence of connected devices and home assistants that create smart living environments. However, the data continuously generated and accumulated by these connected devices create security issues and raise users' privacy concerns. The present study aims to explore the main security issues in smart living environments using data mining techniques. To this end, we applied a three-step data mining analysis to 938,258 tweets collected from Twitter under the user-generated data (UGD) framework. First, sentiment analysis was applied using TextBlob, which was tested with a support vector classifier, multinomial naive Bayes, logistic regression, and a random forest classifier; as a result, the analyzed tweets were divided into those expressing positive, negative, and neutral sentiment. Next, a Latent Dirichlet Allocation (LDA) algorithm was applied to divide the sample into topics related to security issues in smart living environments. Finally, insights were extracted by applying a textual analysis process in Python, validated by analyzing frequency and weighted percentage variables and by calculating the statistical measure known as mutual information (MI) for the identified n-grams (unigrams and bigrams). As a result of the research, 10 topics were identified, from which we found that the main security issues are malware, cybersecurity attacks, data storing vulnerabilities, the use of testing software in IoT, and possible leaks due to the lack of user experience. We discuss different circumstances and causes that may affect user security and privacy when using IoT devices and emphasize the importance of user-generated content (UGC) in the processing of personal data of IoT device users.
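
The final step of the analysis, scoring n-grams with mutual information, can be illustrated with a minimal sketch. The abstract does not state which MI formulation the study used, so the code below assumes pointwise mutual information (PMI) over adjacent word pairs. The function names `tokenize` and `bigram_pmi`, the `min_count` parameter, and the toy sentences are all hypothetical and are not taken from the paper or its dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bigram_pmi(docs, min_count=1):
    """Score each adjacent word pair with pointwise mutual information.

    Assumption: PMI(x, y) = log2(p(x, y) / (p(x) * p(y))), where p(x, y) is
    the bigram's share of all bigrams and p(x), p(y) are unigram shares.
    This is one common MI variant, not necessarily the one used in the study.
    """
    unigrams = Counter()
    bigrams = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        unigrams.update(tokens)
        # Pair each token with its successor inside the same document.
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI estimates are unstable
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical toy input standing in for collected tweets.
toy_docs = [
    "smart home malware attack",
    "smart home data leak",
    "malware attack on smart speaker",
]
toy_scores = bigram_pmi(toy_docs)
for pair, score in sorted(toy_scores.items(), key=lambda kv: -kv[1])[:3]:
    print(pair, round(score, 3))
```

A higher score means the two words co-occur more often than their individual frequencies would predict. On a large corpus, raising `min_count` is a common way to keep rare pairs from dominating the ranking.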