4.7 Article

Water quality classification using machine learning algorithms

Journal

JOURNAL OF WATER PROCESS ENGINEERING
Volume 48

Publisher

ELSEVIER
DOI: 10.1016/j.jwpe.2022.102920

Keywords

Water quality; Water quality index; Machine learning; Stack modelling; Meta classifier; Ensemble models

Funding

  1. Research Institute of Sciences and Engineering (RISE) of the University of Sharjah, UAE
  2. Bio Sensing Group


This study evaluates various artificial intelligence algorithms to develop a reliable approach for accurately predicting water quality. The CATBoost model is found to be the most accurate individual classifier, with an accuracy of 94.51%. Furthermore, when stacking ensemble models are built from all classifiers, the accuracy reaches 100% with several meta-classifiers. Therefore, the boosting algorithm is proposed as a reliable approach for water quality classification.
Monitoring water quality is essential for protecting human health and the environment and for water quality control. Artificial Intelligence (AI) offers significant opportunities to improve the classification and prediction of water quality (WQ). In this study, various AI algorithms are assessed on WQ data collected over an extended period to develop a dependable approach for predicting water quality as accurately as possible. Specifically, several machine learning classifiers and their stacking ensemble models were used to classify the WQ data via the Water Quality Index (WQI). The studied classifiers were Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Decision Tree (DT), CATBoost, XGBoost, and Multilayer Perceptron (MLP). The dataset comprised 1679 samples and their metadata collected over nine years. Precision-recall curves and Receiver Operating Characteristic (ROC) curves were used to assess the performance of the classifiers. The findings revealed that CATBoost was the most accurate individual classifier, with an accuracy of 94.51%. Moreover, after applying stacking ensemble models with all classifiers, accuracy reached 100% with several meta-classifiers. Furthermore, CATBoost achieved the highest accuracy both as a primary gradient boosting algorithm and as a meta-classifier. Therefore, the boosting algorithm is proposed as a reliable approach for WQ classification. The analysis presented in this article provides a framework that can support researchers working toward water quality improvement using artificial intelligence.
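
The stacking idea described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the two base learners (a nearest-centroid rule and a 1-nearest-neighbour rule) and the vote-table meta-classifier are simple stand-ins for the SVM, RF, CATBoost and other models named above, and the twelve two-feature samples with "good"/"poor" labels are synthetic placeholders rather than data from the study. All class and function names are hypothetical. The sketch only shows the mechanics: base learners produce out-of-fold predictions, and a meta-classifier is trained on those predictions.

```python
from collections import Counter, defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Stand-in base learner: predicts the class with the closest mean vector."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda c: sq_dist(row, self.centroids[c]))
            for row in X
        ]


class OneNearestNeighbour:
    """Stand-in base learner: predicts the label of the closest training sample."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        return [
            self.y[min(range(len(self.X)), key=lambda i: sq_dist(row, self.X[i]))]
            for row in X
        ]


class VoteTableMeta:
    """Stand-in meta-classifier: maps each tuple of base predictions to the
    label most often observed with that tuple during training."""

    def fit(self, Z, y):
        counts = defaultdict(Counter)
        for z, label in zip(Z, y):
            counts[tuple(z)][label] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in counts.items()}
        return self

    def predict(self, Z):
        # Unseen combinations fall back to a plain vote over base predictions.
        return [
            self.table.get(tuple(z), Counter(z).most_common(1)[0][0]) for z in Z
        ]


def fit_stack(X, y, base_factories, meta, k=3):
    """Train the meta-classifier on out-of-fold base predictions,
    then refit every base learner on the full training set."""
    n = len(X)
    oof = [[None] * len(base_factories) for _ in range(n)]
    for fold in range(k):
        held_out = [i for i in range(n) if i % k == fold]
        kept = [i for i in range(n) if i % k != fold]
        X_tr, y_tr = [X[i] for i in kept], [y[i] for i in kept]
        for j, make in enumerate(base_factories):
            preds = make().fit(X_tr, y_tr).predict([X[i] for i in held_out])
            for i, p in zip(held_out, preds):
                oof[i][j] = p
    meta.fit(oof, y)
    bases = [make().fit(X, y) for make in base_factories]
    return bases, meta


def predict_stack(bases, meta, X):
    """Feed the base learners' predictions to the meta-classifier."""
    Z = list(zip(*(b.predict(X) for b in bases)))
    return meta.predict(Z)


# Synthetic, well-separated samples (NOT data from the study).
X_train = [
    [1.0, 1.0], [4.0, 4.2], [1.2, 0.8], [4.1, 3.9], [0.9, 1.1], [3.8, 4.0],
    [1.1, 1.3], [4.3, 4.1], [0.8, 0.9], [3.9, 3.7], [1.3, 1.0], [4.2, 4.4],
]
y_train = ["good", "poor"] * 6

bases, meta = fit_stack(
    X_train, y_train, [NearestCentroid, OneNearestNeighbour], VoteTableMeta()
)
predictions = predict_stack(bases, meta, [[1.0, 0.9], [4.0, 4.0]])
print(predictions)
```

Training the meta-classifier on out-of-fold predictions, rather than on predictions for samples the base learners have already seen, is the usual way to keep the second stage from simply memorising base-learner overfitting. In practice, a library implementation with real classifiers would replace these toy classes.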
