Article

The DetectDeviatingCells algorithm was a useful addition to the toolkit for cellwise error detection in observational data

Journal

JOURNAL OF CLINICAL EPIDEMIOLOGY
Volume 157, Pages 35-45

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.jclinepi.2023.02.015

Keywords

Error detection; Outlier; Data quality; DetectDeviatingCells; Mahalanobis distance; Robust statistics

This study evaluated the performance of the DetectDeviatingCells (DDC) algorithm in detecting data anomalies at the observation and variable level in continuous variables. The DDC algorithm showed promising results in improving error detection processes for observational data, particularly in detecting complex error patterns.
Objectives: We evaluated the error detection performance of the DetectDeviatingCells (DDC) algorithm, which flags data anomalies at observation (casewise) and variable (cellwise) level in continuous variables. We compared its performance to other approaches in a simulated dataset.

Study Design and Setting: We simulated height and weight data for hypothetical individuals aged 2-20 years. We changed a proportion of height values according to predetermined error patterns. We applied the DDC algorithm and other error-detection approaches (descriptive statistics, plots, fixed-threshold rules, and classic and robust Mahalanobis distance) and compared error detection performance using sensitivity, specificity, likelihood ratios, predictive values, and receiver operating characteristic (ROC) curves.

Results: At our chosen thresholds, error detection specificity was excellent across all scenarios for all methods, and sensitivity was higher for multivariable and robust methods. The DDC algorithm performance was similar to other robust multivariable methods. Analysis of ROC curves suggested that all methods had comparable performance for gross errors (e.g., wrong measurement unit), but the DDC algorithm outperformed the others for more complex error patterns (e.g., transcription errors that are still plausible, although extreme).

Conclusions: The DDC algorithm has the potential to improve error detection processes for observational data.

© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
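
To make the evaluation described in the abstract more concrete, the following is a minimal Python sketch of one of the comparison methods (classic Mahalanobis distance) scored with the listed performance measures. It is not the authors' code and does not implement DDC. The growth formulas, the 5% error rate, the inches-for-centimetres error pattern, the 0.975 chi-square cutoff, and all function names are assumptions chosen purely for illustration.

```python
import math
import random


def simulate(n=500, error_rate=0.05, seed=0):
    """Toy height/weight data for ages 2-20 (hypothetical formulas, not a growth reference)."""
    rng = random.Random(seed)
    points, is_error = [], []
    for _ in range(n):
        age = rng.uniform(2, 20)
        height = 80 + 5 * age + rng.gauss(0, 5)        # cm, assumed linear growth
        weight = 0.5 * height - 30 + rng.gauss(0, 4)   # kg, assumed relation
        err = rng.random() < error_rate
        if err:
            height /= 2.54  # assumed gross error: inches recorded instead of cm
        points.append((height, weight))
        is_error.append(err)
    return points, is_error


def mahalanobis_2d(points):
    """Classic (non-robust) Mahalanobis distance of each point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    dists = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the closed-form inverse of the 2x2 covariance matrix.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        dists.append(math.sqrt(d2))
    return dists


def _ratio(a, b):
    """Return a / b, or NaN when the denominator is zero."""
    return a / b if b else float("nan")


def diagnostic_metrics(flagged, is_error):
    """Sensitivity, specificity, predictive values, and likelihood ratios of the flags."""
    tp = sum(1 for f, e in zip(flagged, is_error) if f and e)
    fp = sum(1 for f, e in zip(flagged, is_error) if f and not e)
    fn = sum(1 for f, e in zip(flagged, is_error) if not f and e)
    tn = sum(1 for f, e in zip(flagged, is_error) if not f and not e)
    sens, spec = _ratio(tp, tp + fn), _ratio(tn, tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "lr_positive": _ratio(sens, 1 - spec),
        "lr_negative": _ratio(1 - sens, spec),
    }


points, is_error = simulate()
# Square root of the 0.975 quantile of a chi-square with 2 degrees of freedom,
# which has the closed form -2 * ln(1 - p).
threshold = math.sqrt(-2 * math.log(1 - 0.975))
flagged = [d > threshold for d in mahalanobis_2d(points)]
metrics = diagnostic_metrics(flagged, is_error)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the sample mean and covariance are themselves estimated from contaminated data, this classic version can be pulled toward the errors it is meant to detect. That is the usual motivation for the robust variants the abstract compares it against.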
