4.7 Article

Application of structural topic modeling to aviation safety data

Journal

RELIABILITY ENGINEERING & SYSTEM SAFETY
Volume 224

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ress.2022.108522

Keywords

Machine learning; Structural topic modeling; Safety; ASRS; NTSB; Accident; Incident

This paper presents an application of structural topic modeling (STM) to analyze aviation safety data. The results show that STM has the potential to identify themes within technical datasets, with improved performance for more specific corpora that use precise and unique language.
Data-driven frameworks for analyzing aviation safety data have recently gained traction. Text-based machine learning techniques often rely purely on word frequency analysis to eliminate the innate subjectivity of human language, but more refined techniques like structural topic modeling (STM) attempt to simulate text generation to identify the thematic undertones of text corpora. This paper presents an application of STM to two text-based sets of aviation safety data: reports from the Aviation Safety Reporting System (ASRS), and accident and incident reports published by the National Transportation Safety Board (NTSB). A framework for cleaning and pre-processing the datasets is discussed, including a brief discussion of bag-of-words and TF-IDF representations of narratives. The methodology behind STM is described, including techniques for selecting the optimal number of topics. The results of the STM analysis on the ASRS and NTSB datasets are presented, with a focus on the clarity and specificity of topics, judged by the most common words associated with each topic. A brief exploration of the correlation between pairs of topic labels is also undertaken, including a visualization of narratives in two-dimensional space. STM is found to show promise in identifying themes within technical datasets, with model performance increasing for more specific corpora that use precise and unique language.
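
The bag-of-words and TF-IDF representations mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the toy narratives are hypothetical (not drawn from ASRS or NTSB data), the helper names `bag_of_words` and `tf_idf` are invented for illustration, and the weighting shown (term count divided by document length, times log(N / document frequency)) is only one common TF-IDF variant, assumed here because the abstract does not specify one.

```python
import math
from collections import Counter

# Hypothetical toy narratives; not taken from ASRS or NTSB data.
narratives = [
    "engine failure during climb",
    "runway incursion during taxi",
    "engine fire warning during climb",
]

def bag_of_words(text):
    """Count how often each lowercase, whitespace-separated token occurs."""
    return Counter(text.lower().split())

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    Other variants (smoothing, normalization) are also in common use.
    """
    bows = [bag_of_words(d) for d in docs]
    n_docs = len(bows)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for bow in bows for term in bow)
    weights = []
    for bow in bows:
        total = sum(bow.values())
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bow.items()
        })
    return weights

bows = [bag_of_words(d) for d in narratives]
weights = tf_idf(narratives)
```

Under this weighting, a word that appears in every narrative (here "during") receives a weight of zero, while a word confined to a single narrative (such as "runway") is weighted most heavily, which is the sense in which TF-IDF favors precise and unique language.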
