Integrating Information Entropy and Latent Dirichlet Allocation Models for Analysis of Safety Accidents in the Construction Industry
Author(s): | Yipeng Liu, Junwu Wang, Shanrong Tang, Jiaji Zhang, Jinyingjun Wan |
---|---|
Medium: | journal article |
Language(s): | English |
Published in: | Buildings, 28 June 2023, n. 7, v. 13 |
Page(s): | 1831 |
DOI: | 10.3390/buildings13071831 |
Abstract: |
Construction accident investigation reports contain critical information, but extracting useful insights from the voluminous Chinese text is challenging. Traditional methods rely on expert judgment, which is time-consuming and can produce inaccurate results. To overcome this problem, we propose a novel approach that combines text mining techniques and latent Dirichlet allocation (LDA) models to analyze standardized accident investigation reports in the Chinese construction industry. The proposed method integrates an information entropy term frequency-inverse document frequency (TF-IDF) weighting scheme to evaluate term importance and accounts for word and model uncertainty. The method was applied to a set of construction industry accident reports to identify the key factors leading to safety accidents. The results show that the causal factors of accidents in Chinese accident investigation reports consist of keywords and negative expressions, including “failure to timely identify safety hazards” and “inadequate site safety management”. Failure to identify safety hazards in time is the most common factor in accident investigation reports, and the negative expressions commonly used in the reports include “not timely” and “not in place”. The information entropy TF-IDF method is superior to traditional methods in terms of accuracy and efficiency, and the LDA model that considers word frequency and feature weights is better able to capture the underlying themes in the Chinese corpus. The subject terms that make up these themes also contain more information about the causes of accidents. This approach helps site managers understand, more quickly and effectively, the causal factors and key messages in incident reports that explain why accidents occur. It gives site managers insight into common patterns and themes associated with safety incidents, such as unsafe practices, hazardous work environments, and non-compliance with safety regulations. This enables them to make informed decisions to improve safety management practices. |
Copyright: | © 2023 by the authors; licensee MDPI, Basel, Switzerland. |
License: | This creative work has been published under the Creative Commons Attribution 4.0 International (CC-BY 4.0) license which allows copying, and redistribution as well as adaptation of the original work provided appropriate credit is given to the original author and the conditions of the license are met. |
- About this data sheet
- Reference-ID: 10737124
- Published on: 03/09/2023
- Last updated on: 14/09/2023