Article

Transformer-based deep learning model and video dataset for unsafe action identification in construction projects

AUTOMATION IN CONSTRUCTION (2023)

Journal

AUTOMATION IN CONSTRUCTION

Volume 146, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.autcon.2022.104703

Keywords

Action recognition; Construction safety; Transformer; Deep learning

Categories

Construction & Building Technology; Engineering, Civil

Summary

This paper presents a deep learning model called Spatial Temporal Relation Transformer (STR-Transformer) for automatically identifying risky behaviors on construction sites. By simultaneously extracting and fusing spatial and temporal features from video streams, the STR-Transformer enables more accurate and reliable safety surveillance, with the potential to reduce accident rates and management costs.

Abstract

A large proportion of construction accidents are caused by unintentional and unsafe actions and behaviors. Monitoring unsafe behaviors through conventional manual supervision is difficult and ineffective because working conditions on construction sites are complex and dynamic. Recently, surveillance videos and computer vision (CV) techniques have been increasingly adopted to identify risky behaviors automatically. However, current CV models still cannot effectively capture and fuse the spatial and temporal features in video clips. To address this challenge, this paper describes a deep learning model named Spatial Temporal Relation Transformer (STR-Transformer), in which spatial and temporal features of work behaviors are extracted simultaneously in parallel video streams and then fused by a specially designed module. To verify the effectiveness of the STR-Transformer, a customized dataset is developed, comprising 1595 video clips across seven categories of construction worker behaviors. In numerical experiments and case studies, the STR-Transformer achieves an average precision of 88.7%, 4.0% higher than the baseline model. The STR-Transformer enables more accurate and reliable automatic safety surveillance on construction projects and is expected to reduce accident rates and management costs. Moreover, the performance of the STR-Transformer relies on efficient feature integration, which may inspire future studies to identify, extract, and fuse richer features when applying CV-based deep learning models in construction management.
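The abstract describes two parallel streams, one spatial and one temporal, whose outputs are fused, but it does not specify the layers or the fusion module. The toy sketch below illustrates only the general idea, under loud assumptions: attention uses identity projections, fusion is plain concatenation, and every function name (`spatial_stream`, `temporal_stream`, `fuse`, `clip_descriptor`) is hypothetical. It is not the authors' STR-Transformer.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of n vectors (each a list of d floats); returns n vectors.
    A real model would use learned projections; this is an assumption-free toy.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def spatial_stream(video):
    """Attend across patches within each frame (video[t][p] is a feature vector)."""
    return [self_attention(frame) for frame in video]


def temporal_stream(video):
    """Attend across frames at each fixed patch position."""
    n_frames, n_patches = len(video), len(video[0])
    out = [[None] * n_patches for _ in range(n_frames)]
    for p in range(n_patches):
        track = self_attention([video[t][p] for t in range(n_frames)])
        for t in range(n_frames):
            out[t][p] = track[t]
    return out


def fuse(spatial, temporal):
    """Placeholder fusion (assumption): concatenate the two feature vectors."""
    return [[s + t for s, t in zip(sf, tf)] for sf, tf in zip(spatial, temporal)]


def clip_descriptor(video):
    """Run both streams on the same clip, fuse, and mean-pool into one vector."""
    fused = fuse(spatial_stream(video), temporal_stream(video))
    tokens = [tok for frame in fused for tok in frame]
    dim = len(tokens[0])
    return [sum(tok[j] for tok in tokens) / len(tokens) for j in range(dim)]


# Synthetic clip: 2 frames, 3 patches per frame, 4 features per patch.
video = [[[(t + p + c) * 0.1 for c in range(4)] for p in range(3)] for t in range(2)]
desc = clip_descriptor(video)
print(len(desc))  # 8: four spatial features followed by four temporal features
```

In a classifier, a pooled descriptor like `desc` would feed a final layer that scores each behavior category; the abstract mentions seven categories but gives no detail on that stage.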
