4.7 Article

Transformer-based deep learning model and video dataset for unsafe action identification in construction projects

Journal

AUTOMATION IN CONSTRUCTION
Volume 146, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.autcon.2022.104703

Keywords

Action recognition; Construction safety; Transformer; Deep learning

This paper presents a deep learning model called Spatial Temporal Relation Transformer (STR-Transformer) for automatically identifying risky behaviors on construction sites. By simultaneously extracting and fusing spatial and temporal features from video streams, the STR-Transformer enables more accurate and reliable safety surveillance, with the potential to reduce accident rates and management costs.
A large proportion of construction accidents are caused by unintentional, unsafe actions and behaviors. Conventional manual supervision is a difficult and ineffective way to monitor unsafe behaviors because working conditions on construction sites are complex and dynamic. Recently, surveillance videos and computer vision (CV) techniques have been increasingly adopted to identify risky behaviors automatically. However, current CV models still cannot effectively capture and fuse the spatial and temporal features in video clips. To address this challenge, this paper describes a deep learning model named Spatial Temporal Relation Transformer (STR-Transformer), in which spatial and temporal features of work behaviors are extracted simultaneously in parallel video streams and then fused by a specially designed module. To verify the effectiveness of the STR-Transformer, a customized dataset is developed, comprising 1595 video clips across seven categories of construction worker behaviors. In numerical experiments and case studies, the STR-Transformer achieves an average precision of 88.7%, 4.0% higher than the baseline model. The STR-Transformer enables more accurate and reliable automatic safety surveillance on construction projects and is expected to reduce accident rates and management costs. Moreover, the performance of the STR-Transformer relies on efficient feature integration, which may inspire future studies to identify, extract, and fuse richer features when applying CV-based deep learning models in construction management.
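The general idea of extracting spatial and temporal features in parallel streams and then fusing them can be illustrated with a toy sketch. This is not the STR-Transformer itself: the abstract does not specify its layers or its fusion module, so the function names (`spatial_stream`, `temporal_stream`, `fuse`), the hand-written features, and the fusion by concatenation are all assumptions made purely for illustration.

```python
from typing import List

# Assumption: a clip is a list of grayscale frames, each a grid of pixel intensities.
Frame = List[List[float]]


def spatial_stream(clip: List[Frame]) -> List[float]:
    """Toy appearance feature: mean intensity per frame, averaged over the clip."""
    per_frame = [sum(map(sum, f)) / (len(f) * len(f[0])) for f in clip]
    return [sum(per_frame) / len(per_frame)]


def temporal_stream(clip: List[Frame]) -> List[float]:
    """Toy motion feature: mean absolute pixel change between consecutive frames."""
    diffs = []
    for prev, curr in zip(clip, clip[1:]):
        total = sum(
            abs(c - p)
            for prev_row, curr_row in zip(prev, curr)
            for p, c in zip(prev_row, curr_row)
        )
        diffs.append(total / (len(curr) * len(curr[0])))
    # A single-frame clip has no motion to measure.
    return [sum(diffs) / len(diffs)] if diffs else [0.0]


def fuse(spatial: List[float], temporal: List[float]) -> List[float]:
    """Hypothetical fusion step: concatenate the two feature vectors."""
    return spatial + temporal


# Two 2x2 frames: all-dark, then all-bright.
clip = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]
features = fuse(spatial_stream(clip), temporal_stream(clip))
```

In a learned model, each stream would be a trained network and the fused vector would feed a classifier over the behavior categories; the sketch only shows that the two streams read the same clip independently before their outputs are combined.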
