Article

Bilevel Feature Extraction-Based Text Mining for Fault Diagnosis of Railway Systems

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2017)

Journal

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS

Volume 18, Issue 1, Pages 49-58

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TITS.2016.2521866

Keywords

Text mining; feature selection; fault diagnosis; railway systems

Categories

Engineering, Civil; Engineering, Electrical & Electronic; Transportation Science & Technology

Funding

National Natural Science Foundation of China [61473029]
U.S. NSF [CMMI-1162482]
Railway Ministry of Science Technology Research and Development Program [2014X008-A]
State Key Laboratory of Rail Traffic Control and Safety of Beijing Jiaotong University [RCS2016ZT010, RCS2014ZT27]
FDCT (Fundo para o Desenvolvimento das Ciências e da Tecnologia) [119/2014/A3]

Abstract

A vast amount of text data is recorded in the form of repair verbatims in railway maintenance sectors. Efficient text mining of such maintenance data plays an important role in detecting anomalies and improving fault diagnosis efficiency. However, unstructured verbatims, high-dimensional data, and imbalanced fault class distributions pose challenges for feature selection and fault diagnosis. We propose a bilevel feature extraction-based text mining method that integrates features extracted at both the syntax and semantic levels, with the aim of improving fault classification performance. We first perform an improved χ² statistics-based feature selection at the syntax level to overcome the learning difficulty caused by an imbalanced data set. Then, we perform a prior latent Dirichlet allocation-based feature selection at the semantic level to reduce the data set into a low-dimensional topic space. Finally, we fuse the fault features derived from the syntax and semantic levels via serial fusion. The proposed method uses fault features at different levels and enhances the precision of fault diagnosis for all fault classes, particularly minority ones. Its performance has been validated using a railway maintenance data set collected from 2008 to 2014 by a railway corporation. It outperforms traditional approaches.
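The three steps named in the abstract (syntax-level selection, semantic-level topic reduction, serial fusion) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn is available, substitutes the standard χ² score for the paper's improved χ² statistic, and uses plain latent Dirichlet allocation in place of the prior LDA variant. The repair records, labels, function name `bilevel_features`, and parameter values are all hypothetical.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy repair records and fault labels (not the paper's data set).
records = [
    "signal lamp failure replaced lamp unit",
    "signal lamp dim replaced bulb",
    "track circuit short cleaned rail joint",
    "track circuit relay fault replaced relay",
    "switch machine jammed lubricated switch rod",
    "switch machine motor fault replaced motor",
]
labels = np.array([0, 0, 1, 1, 2, 2])


def bilevel_features(texts, y, k_terms=5, n_topics=3, seed=0):
    """Return syntax-level and semantic-level features joined side by side."""
    # Bag-of-words term counts shared by both levels.
    counts = CountVectorizer().fit_transform(texts)

    # Syntax level: keep the k terms with the highest standard chi-square
    # score (assumption: stands in for the paper's improved statistic).
    k = min(k_terms, counts.shape[1])
    syntax = SelectKBest(chi2, k=k).fit_transform(counts, y).toarray()

    # Semantic level: project each record onto a low-dimensional topic space
    # (assumption: plain LDA, not the prior LDA described in the abstract).
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    semantic = lda.fit_transform(counts)

    # Serial fusion: concatenate the two feature vectors for each record.
    return np.hstack([syntax, semantic])


features = bilevel_features(records, labels)
print(features.shape)  # (6, 8): 5 selected terms + 3 topic proportions
```

The fused matrix can then be passed to any classifier. Concatenation keeps both views intact, so the classifier sees the discriminative terms and the topic proportions together.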
