Article

Deep unsupervised methods towards behavior analysis in ubiquitous sensor data

Journal

INTERNET OF THINGS
Volume 17, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.iot.2021.100486

Keywords

Ubiquitous data analysis; Behavior analysis; Self supervised learning


The study introduces a novel clustering technique for behavioral analysis, which efficiently uncovers hidden routines and captures patterns in ubiquitous data. Additionally, three different techniques are evaluated for comparative study, showing the high efficiency of the proposed method in handling high-dimensional data.
Behavioral analysis (BA) on ubiquitous sensor data is the task of finding the latent distribution of features for modeling user-specific characteristics. These characteristics, in turn, can be used for a number of tasks including resource management, power efficiency, and smart home applications. In recent years, topic models employed for BA have been found to successfully extract the dynamics of the sensed data. Topic modeling is commonly performed on text data to mine inherent topics, and the latent topics are found in an unsupervised manner. In this work we propose a novel clustering technique for BA that finds hidden routines in ubiquitous data and also captures the patterns in those routines. Our approach works efficiently on high-dimensional data without performing any computationally expensive reduction operations. For comparison, we evaluate three techniques, namely Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and Probabilistic Latent Semantic Analysis (PLSA). We analyze the efficiency of the methods using performance indices such as perplexity and silhouette on three real-world ubiquitous sensor datasets, namely Intel Lab, Kyoto, and MERL. Through rigorous experiments, we achieve clustering silhouette scores of 0.7049 on the Intel Lab dataset, 0.6547 on the Kyoto dataset, and 0.8312 on the MERL dataset. In these cases, however, it is difficult to validate the results obtained, as the datasets do not contain any ground truth information. To address this, we investigate a self-supervised method capable of capturing the inherent ground truths available in the dataset. We design a self-supervised technique and apply it to datasets both with and without ground truth. Our performance on data without ground truth differs from that on data with ground truth by approximately 8% (F-score), showing the efficacy of self-supervised techniques in capturing ground truth information.
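
For readers unfamiliar with the silhouette index cited in the abstract, the following is a minimal sketch of the standard mean silhouette coefficient. It is not the authors' evaluation code. The function names `euclidean` and `silhouette_score` are illustrative, and Euclidean distance is an assumption, since the abstract does not state which distance metric was used. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the point's score is (b - a) / max(a, b).

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length points (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (standard definition)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(euclidean(points[i], points[j]) for j in own if j != i)
        a /= len(own) - 1
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy data (made up for illustration): two well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(toy_points, [0, 0, 1, 1])  # groups match the labels
bad = silhouette_score(toy_points, [0, 1, 0, 1])   # groups split across labels
```

The score lies in [-1, 1]. Values near 1 indicate compact, well-separated clusters, which is the scale on which the reported 0.7049, 0.6547, and 0.8312 should be read. Because the index needs no labels, it can be computed on datasets without ground truth, but for the same reason it measures cluster geometry rather than correctness.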


