Article

Evaluating statistical model performance in water quality prediction

Journal

JOURNAL OF ENVIRONMENTAL MANAGEMENT
Volume 206, Pages 910-919

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
DOI: 10.1016/j.jenvman.2017.11.049

Keywords

Water quality prediction; E. coli; Statistical models; Bayesian networks

Funding

  1. ESR Postgraduate scholarship
  2. UC Connect Doctoral Scholarship (The University of Canterbury, New Zealand)


Exposure to contaminated water while swimming, boating or participating in other recreational activities can cause gastrointestinal and respiratory disease. Water bodies often experience rapid fluctuations in water quality, so it is vital to predict these fluctuations accurately and in time to minimise the population's exposure to pathogenic organisms. E. coli is commonly used as an indicator of water quality in freshwater, and higher counts of E. coli are associated with an increased risk of illness. In this case study, we compare the performance of a wide range of statistical models in predicting water quality via E. coli levels, using weekly data collected over the summer months from 2006 to 2014 at the recreational site on the Oreti river in Wallacetown, New Zealand. The models include a naive model, multiple linear regression, dynamic regression, regression tree, Markov chain, classification tree, random forests, multinomial logistic regression, discriminant analysis and Bayesian network. The results show that the Bayesian network was superior to all the other models. Overall, it had a leave-one-out and k-fold cross-validation error rate of 21%, while predicting the majority of instances of E. coli levels classified as unsafe by the Microbiological Water Quality Guidelines for Marine and Freshwater Recreational Areas 2003, New Zealand. Because Bayesian networks are also flexible in handling missing data and outliers and allow for continuous updating in real time, we have found them to be a promising tool and plan, in the future, to extend the analysis beyond the current case study site. (C) 2017 Elsevier Ltd. All rights reserved.
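
The abstract reports a leave-one-out and k-fold cross-validation error rate. As a rough illustration of how such a misclassification rate can be computed, here is a minimal Python sketch. It is not the authors' code or data: the function names (`fit_majority`, `cv_error_rate`), the interleaved fold assignment, the single rainfall-like feature and the six toy records are all assumptions made for this example, and the majority-class baseline only stands in for the models compared in the paper.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# One record: (feature values, class label). Hypothetical layout for this sketch.
Row = Tuple[Tuple[float, ...], str]
Predictor = Callable[[Tuple[float, ...]], str]


def fit_majority(train: Sequence[Row]) -> Predictor:
    """Stand-in baseline: always predict the most common training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common


def cv_error_rate(data: Sequence[Row],
                  fit: Callable[[Sequence[Row]], Predictor],
                  k: int) -> float:
    """Misclassification rate under k-fold cross-validation.

    Folds are interleaved (record i goes to fold i % k), which is an
    assumption of this sketch. Setting k == len(data) gives leave-one-out.
    """
    if not 2 <= k <= len(data):
        raise ValueError("k must be between 2 and the number of records")
    errors = 0
    for fold in range(k):
        test = [row for i, row in enumerate(data) if i % k == fold]
        train = [row for i, row in enumerate(data) if i % k != fold]
        predict = fit(train)  # refit on the training folds only
        errors += sum(predict(x) != y for x, y in test)
    return errors / len(data)


# Made-up records: one rainfall-like feature and a safe/unsafe label.
toy: List[Row] = [
    ((0.0,), "safe"), ((2.1,), "safe"), ((15.3,), "unsafe"),
    ((0.4,), "safe"), ((22.0,), "unsafe"), ((1.2,), "safe"),
]

loo_error = cv_error_rate(toy, fit_majority, k=len(toy))  # leave-one-out
kfold_error = cv_error_rate(toy, fit_majority, k=3)       # 3-fold
```

Any classifier can be swapped in for `fit_majority` as long as it takes the training rows and returns a prediction function. For weekly time-series data like that described above, the way folds are formed matters, and the interleaved split used here is only one simple option.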
