Article

A robust fleet-based anomaly detection framework applied to wind turbine vibration data

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.engappai.2023.106859

Keywords

Machine learning methods; Fault Detection models; Unsupervised learning; Condition monitoring system; Cross-validation; Receiver operating characteristic curve


This paper presents a robust unsupervised machine-learning approach for fleet-based anomaly detection in wind turbines' critical components. The approach preprocesses noisy, unlabeled, and unstructured vibration data, extracts features from it, and optimizes eleven machine learning algorithms. The six best models are selected using robust performance metrics and reach values above 90% for the area under the ROC curve.
Wind turbine condition monitoring systems (CMS) produce large amounts of unlabeled data that capture the turbines' operational status. Given this volume of data, robust systems that detect incipient wind turbine faults and perform well on unseen test data are crucial to maximizing wind farm performance. This paper presents an implementation of a robust unsupervised machine-learning approach capable of fleet-based anomaly detection in wind turbines' critical components. The proposed methodology is applied to noisy, unlabeled, and unstructured vibration data, which must first pass through databank decoding, data engineering, preprocessing, and feature extraction. Twelve operational wind turbines with varying health conditions are used to train, validate, and test the models. Features from different domains (time, frequency, and mechanical) are extracted and used as model inputs. Experts labeled the condition of each wind turbine component by evaluating the CMS output. The paper's novelty lies in combining distinct approaches to optimize eleven unsupervised machine learning algorithms, through an uncommon 5x2 cross-validation scheme, on real, noisy, and unstructured wind turbine data. The methodology selected the six best models (k-nearest neighbors, clustering-based local outlier, histogram-based outlier, isolation forest, principal component analysis, and minimum covariance determinant) based on robust performance metrics: accuracy, F1-score, precision, recall, and area under the receiver operating characteristic (ROC) curve. These models generalized well and returned reasonable classification metrics for such a complex problem, with values above 90% for the area under the ROC curve.
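
As a rough illustration of the evaluation idea described above, the sketch below runs a 5x2 cross-validation (five repetitions of a two-fold split) around a k-nearest-neighbors anomaly score and reports the area under the ROC curve. It is not the paper's implementation: the synthetic three-feature data, the stratified split, the choice of k, and the helper names (`make_toy_data`, `knn_scores`, `roc_auc`, `five_by_two_cv`) are all assumptions made for this example. Labels are used only to score the detector, never to fit it.

```python
import math
import random
import statistics


def make_toy_data(seed=1):
    """Hypothetical stand-in for extracted vibration features (not real data)."""
    rng = random.Random(seed)
    normal = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
    faulty = [[rng.gauss(5, 2) for _ in range(3)] for _ in range(20)]
    return normal + faulty, [0] * 200 + [1] * 20


def knn_scores(train, test, k=5):
    """Anomaly score: mean Euclidean distance to the k nearest training points."""
    scores = []
    for x in test:
        dists = sorted(math.dist(x, t) for t in train)
        scores.append(sum(dists[:k]) / k)
    return scores


def roc_auc(labels, scores):
    """AUC as the chance that a random anomaly outscores a random normal point
    (ties count as one half)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def five_by_two_cv(X, y, k=5, seed=0):
    """Five repetitions of a stratified two-fold split; each half serves once
    as the fitting set and once as the held-out set, giving ten AUC values."""
    rng = random.Random(seed)
    by_class = [[i for i, lab in enumerate(y) if lab == c] for c in (0, 1)]
    aucs = []
    for _ in range(5):
        folds = ([], [])
        for members in by_class:
            shuffled = members[:]
            rng.shuffle(shuffled)
            half = len(shuffled) // 2
            folds[0].extend(shuffled[:half])
            folds[1].extend(shuffled[half:])
        for train_idx, test_idx in (folds, folds[::-1]):
            train = [X[i] for i in train_idx]
            test_X = [X[i] for i in test_idx]
            test_y = [y[i] for i in test_idx]
            aucs.append(roc_auc(test_y, knn_scores(train, test_X, k)))
    return aucs


X, y = make_toy_data()
aucs = five_by_two_cv(X, y)
print(f"mean AUC over {len(aucs)} folds: {statistics.mean(aucs):.3f}")
```

Repeating the two-fold split five times yields ten held-out scores per model, which gives a spread as well as a mean when comparing candidate detectors; any of the other algorithms named above could be swapped in for `knn_scores` as long as it returns one anomaly score per test sample.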
