Journal
BUILDINGS
Volume 13, Issue 7, Pages -
Publisher
MDPI
DOI: 10.3390/buildings13071831
Keywords
text mining; information entropy TF-IDF; latent Dirichlet allocation (LDA); accident investigation report; construction industry
Construction accident investigation reports are difficult to analyze due to the voluminous Chinese text. To overcome this problem, a novel approach combining text mining techniques and LDA models is proposed to identify the key factors leading to safety accidents in the Chinese construction industry.
Construction accident investigation reports contain critical information, but extracting useful insights from the voluminous Chinese text is challenging. Traditional methods rely on expert judgment, which is time-consuming and can produce inaccurate results. To overcome this problem, we propose a novel approach that combines text mining techniques and latent Dirichlet allocation (LDA) models to analyze standardized accident investigation reports in the Chinese construction industry. The proposed method integrates an information entropy term frequency-inverse document frequency (TF-IDF) weighting scheme to evaluate term importance, and it accounts for word and model uncertainty. The method was applied to a set of construction industry accident reports to identify the key factors leading to safety accidents. The results show that the causal factors described in Chinese accident investigation reports consist of keywords combined with negative expressions, for example "failure to timely identify safety hazards" and "inadequate site safety management". Failure to timely identify safety hazards is the most common factor in the reports, and the negative expressions used most often include "not timely" and "not in place". The information entropy TF-IDF method is superior to traditional methods in terms of accuracy and efficiency, and the LDA model that considers both word frequency and feature weights is better able to capture the underlying themes in the Chinese corpus; the subject terms that make up those themes also contain more information about the causes of accidents. This approach helps site managers understand, more quickly and effectively, the causal factors and key messages in incident reports. It gives them insight into common patterns and themes associated with safety incidents, such as unsafe practices, hazardous work environments, and non-compliance with safety regulations, which enables them to make informed decisions to improve safety management practices.
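The abstract does not give the exact form of the information entropy TF-IDF weight, so the following is only a minimal Python sketch under stated assumptions. It multiplies a smoothed TF-IDF score by one minus the normalized entropy of each term's distribution over documents, which is the global factor used in classic log-entropy weighting and may differ from the paper's formula. The function name `entropy_tfidf` and the toy tokens are hypothetical, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def entropy_tfidf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists (already word-segmented).
    ASSUMPTION: weight = tf * smoothed idf * (1 - normalized entropy);
    this is an illustrative variant, not the paper's confirmed formula.
    """
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]

    df = Counter()     # number of documents containing each term
    total = Counter()  # corpus-wide occurrences of each term
    for c in counts:
        df.update(c.keys())
        total.update(c)

    # Normalized entropy of each term's spread over documents, in [0, 1].
    # A value near 1 means the term is spread evenly and discriminates poorly.
    entropy = {}
    for term, tot in total.items():
        h = 0.0
        for c in counts:
            if term in c:
                p = c[term] / tot
                h -= p * math.log(p)
        entropy[term] = h / math.log(n_docs) if n_docs > 1 else 0.0

    weights = []
    for doc, c in zip(docs, counts):
        w = {}
        for term, k in c.items():
            tf = k / len(doc)
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1  # smoothed idf
            w[term] = tf * idf * (1.0 - entropy[term])
        weights.append(w)
    return weights


# Toy reports, tokenized by hand for illustration only.
reports = [
    ["hazard", "not", "timely"],
    ["hazard", "management"],
    ["hazard", "scaffold", "scaffold"],
]
for weights in entropy_tfidf(reports):
    print(sorted(weights.items(), key=lambda kv: -kv[1]))
```

In this sketch a term that occurs equally often in every report receives a weight close to zero, while a term concentrated in a single report keeps its full TF-IDF score. Weights of this kind could then be used to rank or filter terms before fitting a topic model such as LDA.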