Article

Robust imputation method with context-aware voting ensemble model for management of water-quality data

WATER RESEARCH (2023)

Journal

WATER RESEARCH

Volume 243, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.watres.2023.120369

Keywords

Water quality; Missing data; Data imputation; Data quality; Data management

Categories

Engineering, Environmental; Environmental Sciences; Water Resources

Summary

Water-quality monitoring and management are crucial, but missing data is a frequent problem in water-quality datasets. Existing imputation methods often fail to robustly impute missing values across various scenarios. We propose a context-aware voting-ensemble model to dynamically select optimal weights and integrate various imputation models for more accurate results.

Abstract

Water-quality monitoring and management are crucial for ensuring the safety and sustainability of water resources. However, missing data is a frequent problem in water-quality datasets, and it can bias the results of hydrological modeling and data analysis. While classic statistical methods and emerging machine/deep learning methods have been applied to impute missing values, most existing methods perform well only in specific missingness scenarios, not across scenarios in general. As a result, they often fail to robustly impute missing values across various scenarios. To address this problem, we propose an imputation method that uses a context-aware voting-ensemble model to dynamically select optimal weights for integrating various imputation models across different missingness scenarios. We first identify the attributes of missingness scenarios that influence imputation accuracy. Then, after introducing missing values into the collected data according to the missingness scenarios, we measure the accuracy of various imputation models across those scenarios. The weights of the imputation models are optimized by estimating non-linear functions with a regression model that captures the relationships between missingness scenarios and the imputation accuracies of the models. The final imputed value of the ensemble model for a missingness scenario is determined by multiplying each imputation model's weight by its imputed value and then summing the products. The method inherits the advantages of state-of-the-art imputation models, including the ability to learn long-term dependencies in time series, as well as the flexibility of a dynamic weighting strategy for handling various missingness scenarios. To validate our method, we evaluate it on real-world water-quality data from a river in South Korea. The proposed method achieves higher accuracy and lower variation of imputed values than baseline models across various missingness scenarios. Furthermore, we show the applicability of our method to various hydrological environments by validating it on an industrial water-quality dataset. This study highlights the potential value of an ensemble model with dynamic weighting for robust imputation of water-quality data.
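The weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single scenario attribute (the missing rate), two hypothetical base imputers, a quadratic polynomial as the non-linear regression, and inverse-error normalization to turn predicted errors into weights. The abstract does not specify any of these choices, and the calibration numbers are toy values, not results from the paper.

```python
import numpy as np

# Toy calibration data (illustrative only, NOT from the paper):
# error of each hypothetical base imputer measured at several missing rates.
missing_rates = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
errors = {
    "linear_interp": np.array([0.10, 0.20, 0.35, 0.55, 0.80]),
    "knn": np.array([0.30, 0.32, 0.35, 0.38, 0.40]),
}

def fit_error_curves(rates, errors_by_model, degree=2):
    """Fit one polynomial per model mapping missing rate to expected error."""
    return {name: np.polyfit(rates, err, degree)
            for name, err in errors_by_model.items()}

def context_weights(curves, rate, eps=1e-6):
    """Predict each model's error for this scenario and convert the
    predictions to weights that sum to 1 (inverse-error weighting,
    an assumption made for this sketch)."""
    names = list(curves)
    predicted = np.array([max(np.polyval(curves[n], rate), eps) for n in names])
    inverse = 1.0 / predicted
    return dict(zip(names, inverse / inverse.sum()))

def ensemble_impute(weights, imputed):
    """Weighted sum: multiply each model's weight by its imputed value, then add."""
    return sum(weights[name] * imputed[name] for name in weights)

curves = fit_error_curves(missing_rates, errors)
w_low = context_weights(curves, 0.1)    # scenario with few missing values
w_high = context_weights(curves, 0.5)   # scenario with many missing values

# Hypothetical imputed values proposed by the two base models for one gap.
value = ensemble_impute(w_high, {"linear_interp": 7.0, "knn": 8.0})
```

With these toy curves, the interpolation model receives the larger weight at a low missing rate and the nearest-neighbour model at a high one, which is the kind of scenario-dependent behavior the abstract attributes to dynamic weighting.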
