Article

Developing a framework for classifying water lead levels at private drinking water systems: A Bayesian Belief Network approach

Journal

WATER RESEARCH
Volume 189

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.watres.2020.116641

Keywords

Bayesian Belief Network; Lead in Drinking Water; Contamination Classification; Water Quality

Funding

  1. U.S. Environmental Protection Agency [8399375]

This research aims to explore modeling approaches to predict the risk of lead in private drinking water systems, with results showing Naive Bayes classifiers performing best in terms of recall and precision. Copper is identified as the most significant predictor of lead, while feature selection has limited impact on performance and discretization methods can greatly influence model performance when paired with classifiers.
The presence of lead in drinking water creates a public health crisis, as lead causes neurological damage at low levels of exposure. The objective of this research is to explore modeling approaches to predict the risk of lead at private drinking water systems. This research uses Bayesian Network approaches to explore interactions among household characteristics, geological parameters, observations of tap water, and laboratory tests of water quality parameters. A knowledge discovery framework is developed by integrating methods for data discretization, feature selection, and Bayes classifiers. Forward selection and backward selection are explored for feature selection. Discretization approaches, including domain-knowledge, statistical, and information-based approaches, are tested to discretize continuous features. Bayes classifiers that are tested include General Bayesian Network, Naive Bayes, and Tree-Augmented Naive Bayes, which are applied to identify Directed Acyclic Graphs (DAGs). Bayesian inference is used to fit conditional probability tables for each DAG. The Bayesian framework is applied to fit models for a dataset collected by the Virginia Household Water Quality Program (VAHWQP), which collected water samples and conducted household surveys at 2,146 households that use private water systems, including wells and springs, in Virginia during 2012 and 2013. Relationships among laboratory-tested water quality parameters, observations of tap water, and household characteristics, including plumbing type, source water, household location, and on-site water treatment are explored to develop features for predicting water lead levels. Results demonstrate that Naive Bayes classifiers perform best based on recall and precision, when compared with other classifiers. Copper is the most significant predictor of lead, and other important predictors include county, pH, and on-site water treatment. Feature selection methods have a marginal effect on performance, and discretization methods can greatly affect model performance when paired with classifiers. Owners of private wells remain disadvantaged and may be at an elevated level of risk, because utilities and governing agencies are not responsible for ensuring that lead levels meet the Lead and Copper Rule for private wells. Insight gained from models can be used to identify water quality parameters, plumbing characteristics, and household variables that increase the likelihood of high water lead levels to inform decisions about lead testing and treatment. (C) 2020 Elsevier Ltd. All rights reserved.
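
The abstract does not give implementation details, so the following is only a minimal sketch of the discretize-then-classify idea it describes. It assumes equal-frequency binning as one possible statistical discretization and uses Laplace-smoothed frequency counts as a simple stand-in for the Bayesian fitting of conditional probability tables. All function names, the feature choice (copper, pH, on-site treatment), and the toy values are hypothetical illustrations; they are not VAHWQP data and not the authors' code.

```python
from collections import Counter, defaultdict
import math


def equal_frequency_cuts(values, n_bins):
    """Cut points that split the sorted values into n_bins roughly equal groups."""
    ordered = sorted(values)
    return [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]


def discretize(value, cuts):
    """Index of the bin that a continuous value falls into."""
    return sum(value >= c for c in cuts)


class NaiveBayes:
    """Categorical Naive Bayes with Laplace smoothing (assumed, not from the paper)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n = len(labels)
        n_features = len(rows[0])
        # counts[j][c][v]: how often feature j takes value v within class c
        self.counts = [defaultdict(Counter) for _ in range(n_features)]
        self.levels = [set() for _ in range(n_features)]
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[j][y][v] += 1
                self.levels[j].add(v)
        return self

    def log_scores(self, row):
        """Unnormalized log posterior for each class: log P(c) + sum_j log P(x_j | c)."""
        scores = {}
        for c in self.classes:
            s = math.log(self.class_counts[c] / self.n)
            for j, v in enumerate(row):
                num = self.counts[j][c][v] + self.alpha
                den = self.class_counts[c] + self.alpha * len(self.levels[j])
                s += math.log(num / den)
            scores[c] = s
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


def precision_recall(y_true, y_pred, positive=True):
    """Precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy, made-up households: copper (mg/L), pH, on-site treatment, high-lead flag.
copper = [0.05, 0.10, 0.20, 0.90, 1.40, 1.10, 0.08, 1.60]
ph = [7.4, 7.1, 6.9, 6.0, 5.8, 6.2, 7.6, 5.5]
treated = ["yes", "yes", "no", "no", "no", "no", "yes", "no"]
lead_high = [False, False, False, True, True, True, False, True]

# Discretize the continuous features, then fit the classifier.
cu_cuts = equal_frequency_cuts(copper, n_bins=2)
ph_cuts = equal_frequency_cuts(ph, n_bins=2)
rows = [
    (discretize(c, cu_cuts), discretize(p, ph_cuts), t)
    for c, p, t in zip(copper, ph, treated)
]
model = NaiveBayes(alpha=1.0).fit(rows, lead_high)

# Evaluate on the training rows only to show the metric computation.
predictions = [model.predict(r) for r in rows]
precision, recall = precision_recall(lead_high, predictions)

# Classify a new hypothetical household.
new_row = (discretize(1.2, cu_cuts), discretize(5.9, ph_cuts), "no")
new_prediction = model.predict(new_row)
```

Changing `n_bins` or swapping `equal_frequency_cuts` for another discretizer is the kind of variation the abstract reports as strongly affecting classifier performance. A real evaluation would use held-out data instead of the training rows.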
