Article

DanHAR: Dual Attention Network for multimodal human activity recognition using wearable sensors

Journal

APPLIED SOFT COMPUTING
Volume 111

Publisher

ELSEVIER
DOI: 10.1016/j.asoc.2021.107728

Keywords

Human activity recognition; Multimodal sensors; Convolutional neural networks; Residual network; Channel attention

Funding

  1. National Natural Science Foundation of China [61203237]
  2. Joint Project of Industry-University-Research of Jiangsu Province, China [BY2016001-02]
  3. Natural Science Foundation of Jiangsu Province, China [BK20191371]

Abstract

In this paper, we present a new dual attention method called DanHAR, which blends channel and temporal attention on residual networks to improve feature representation for sensor-based HAR tasks. Specifically, the channel attention plays a key role in deciding what to focus on, i.e., which sensor modalities, while the temporal attention focuses on the target activity within a long sensor sequence to decide where to focus. Extensive experiments are conducted on four public HAR datasets as well as a weakly labeled HAR dataset. The results show that the dual attention mechanism is of central importance for many activity recognition tasks. We obtain relative improvements of 2.02%, 4.20%, 1.95%, 5.22% and 5.00% over regular ConvNets on the WISDM, UNIMIB SHAR, PAMAP2, and OPPORTUNITY datasets and the weakly labeled HAR dataset, respectively. DanHAR surpasses other state-of-the-art algorithms at negligible computational overhead. Visualization analysis shows that the proposed attention captures the spatial-temporal dependencies of multimodal sensing data, amplifying the more important sensor modalities and timesteps during classification. The results are in good agreement with normal human intuition. (C) 2021 Elsevier B.V. All rights reserved.
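To make the architecture concrete, the sketch below shows a minimal dual (channel plus temporal) attention block in PyTorch. It is a hedged illustration in the spirit of CBAM-style attention, assuming 1D sensor features of shape (batch, channels, timesteps); the class names ChannelAttention, TemporalAttention, and DualAttention, the reduction ratio, the pooling choices, and the convolution kernel size are illustrative assumptions, not the authors' exact DanHAR implementation.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Decides *what* to focus on: weighs sensor-modality channels.
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time); pool over time, score each channel.
        avg = self.mlp(x.mean(dim=2))
        mx = self.mlp(x.amax(dim=2))
        scale = torch.sigmoid(avg + mx).unsqueeze(2)   # (batch, channels, 1)
        return x * scale                               # amplify informative channels

class TemporalAttention(nn.Module):
    # Decides *where* to focus: weighs timesteps of a long sensor sequence.
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv1d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool across channels, keep the time axis: (batch, 2, time).
        stats = torch.cat([x.mean(dim=1, keepdim=True),
                           x.amax(dim=1, keepdim=True)], dim=1)
        scale = torch.sigmoid(self.conv(stats))        # (batch, 1, time)
        return x * scale                               # amplify informative timesteps

class DualAttention(nn.Module):
    # Channel attention followed by temporal attention, applied in sequence.
    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.temporal = TemporalAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.temporal(self.channel(x))

# Usage sketch: refine backbone features before classification.
feats = torch.randn(4, 64, 128)        # (batch, sensor channels, timesteps)
refined = DualAttention(64)(feats)     # same shape, attention-weighted
print(refined.shape)                   # torch.Size([4, 64, 128])

In the paper's design, such a block would sit after the residual convolutional stages, so the channel weights answer "what to focus on" (sensor modalities) and the temporal weights answer "where to focus" (timesteps), matching the behavior described in the abstract.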

