Article

FuzzyAct: A Fuzzy-Based Framework for Temporal Activity Recognition in IoT Applications Using RNN and 3D-DWT

Journal

IEEE TRANSACTIONS ON FUZZY SYSTEMS
Volume 30, Issue 11, Pages 4578-4592

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TFUZZ.2022.3152106

Keywords

Feature extraction; Activity recognition; Three-dimensional displays; Discrete wavelet transforms; Deep learning; Task analysis; Internet of Things; Action recognition; edge computing; Internet of things (IoT); recurrent neural network (RNN); three-dimensional discrete wavelet transform (3D-DWT)

Funding

  1. Natural Science Foundation of China [61836013]
  2. Beijing Natural Science Foundation [4212030]
  3. Beijing Nova Program of Science and Technology [Z191100001119090]
  4. Key Research Program of Frontier Sciences, CAS [ZDBS-LY-DQC016]
  5. Science Foundation Ireland
  6. Department of Agriculture, Food, and Marine via the VistaMilk Research Centre [16/RC/3835]


This article presents a method that combines the discrete wavelet transform (DWT) with a recurrent neural network (RNN) to classify and detect human activities accurately. The approach extracts features with a 3D-DWT and produces an output label for each video clip with the RNN. A rank-based fuzzy method then segregates activities by ranking their probabilities according to confidence scores. In experiments, the method achieves an average mAP of 0.8012 on the ActivityNet dataset.
Despite extensive research in deep learning, human activity recognition (HAR) still faces key challenges in accurate classification and detection. Recognizing activities accurately is central to Internet-of-Things (IoT)-enabled smart surveillance systems. This work therefore combines the discrete wavelet transform (DWT) with a recurrent neural network (RNN) to classify and detect human activities accurately. Recent HAR approaches use three-dimensional (3-D) convolutional neural networks (CNNs) to extract spatial information, which adds a computational burden. In our case, features are extracted using a 3D-DWT instead of 3-D CNNs, performed as three steps of 1D-DWT, to reflect the spatio-temporal features of human action. Given these features, the RNN produces an output label for each video clip while maintaining long-term temporal consistency among neighboring predictions in the output sequence. We observe that feature extraction through 3D-DWT essentially recovers the multiple angles of an activity. Many HAR techniques distinguish an activity from the posture in a single image frame rather than learning the transitional relationship between postures in the temporal sequence, which degrades accuracy. To address this problem, we design a novel rank-based fuzzy approach that segregates activities precisely by ranking the probabilities of activities according to their confidence scores. FuzzyAct achieved an average mean average precision (mAP) of 0.8012 on the ActivityNet dataset and outperformed its baseline counterparts and other state-of-the-art approaches on benchmark datasets. Finally, we present a mechanism to compress the proposed RNN for edge-enabled IoT applications.
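
The decomposition of a 3D-DWT into three passes of a 1D-DWT can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the single decomposition level, the nested-list video layout `[time][height][width]`, and the function names `dwt_along` and `dwt3d` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def dwt_along(vol, axis):
    """One-level 1D Haar DWT along one axis of a volume stored as
    nested lists [time][height][width]. Returns (low, high) volumes
    whose size along `axis` is halved. Assumes an even length on that axis."""
    dims = [len(vol), len(vol[0]), len(vol[0][0])]
    out = dims[:]
    out[axis] = dims[axis] // 2
    scale = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling

    low = [[[0.0] * out[2] for _ in range(out[1])] for _ in range(out[0])]
    high = [[[0.0] * out[2] for _ in range(out[1])] for _ in range(out[0])]

    for t in range(out[0]):
        for y in range(out[1]):
            for x in range(out[2]):
                a = [t, y, x]
                b = [t, y, x]
                a[axis] = 2 * a[axis]      # even-indexed sample of the pair
                b[axis] = 2 * b[axis] + 1  # odd-indexed sample of the pair
                va = vol[a[0]][a[1]][a[2]]
                vb = vol[b[0]][b[1]][b[2]]
                low[t][y][x] = (va + vb) * scale   # approximation
                high[t][y][x] = (va - vb) * scale  # detail
    return low, high


def dwt3d(vol):
    """Separable one-level 3D Haar DWT: three 1D passes (width, height,
    then time). Returns a dict of 8 sub-bands; each key has one letter per
    pass, in pass order, with L = low-pass and H = high-pass."""
    bands = {"": vol}
    for axis in (2, 1, 0):  # width, height, time
        next_bands = {}
        for name, sub in bands.items():
            low, high = dwt_along(sub, axis)
            next_bands[name + "L"] = low
            next_bands[name + "H"] = high
        bands = next_bands
    return bands


if __name__ == "__main__":
    # Hypothetical 2-frame, 2x2-pixel clip with constant intensity.
    clip = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
    for name, band in sorted(dwt3d(clip).items()):
        print(name, band)
```

For a constant clip, all energy lands in the `LLL` sub-band and the seven detail sub-bands are zero. In this sketch, a band whose third letter is `H` has been high-pass filtered along time, so it responds to frame-to-frame change. How FuzzyAct selects or combines sub-bands before the RNN is not stated in the abstract.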
