Multiple Linear Regression and Machine Learning for Predicting the Drinking Water Quality Index in Al-Seine Lake

Journal

SMART CITIES
Volume 6, Issue 5, Pages 2807-2827

Publisher

MDPI
DOI: 10.3390/smartcities6050126

Keywords

Seine lake; machine learning; water quality; water quality index; evaluation; prediction

Ensuring safe and clean drinking water is crucial, and this study evaluates different models to monitor and predict water quality. The MLR and ML models, such as linear regression and Bayesian ridge chain, performed well in predicting the water quality index (WQI) of Al-Seine Lake. The results support using these models for accurate water quality management.
Ensuring safe and clean drinking water for communities is crucial. Population growth, industrial activities, and environmental pollution make effective tools for monitoring and predicting water quality a necessity. This paper evaluates the performance of multiple linear regression (MLR) and nineteen machine learning (ML) models, including regression-based, decision-tree-based, and boosting-based algorithms. The models include linear regression (LR), least angle regression (LAR), Bayesian ridge chain (BR), ridge regression (Ridge), k-nearest neighbor regression (K-NN), extra tree regression (ET), and extreme gradient boosting (XGBoost). The objective of the research is to estimate the surface water quality of Al-Seine Lake in Lattakia governorate using the MLR and ML models. We used water quality data from the drinking water lake of Lattakia City, Syria, for the years 2021-2022 to determine the water quality index (WQI). The predictive performance of both the MLR and ML models was evaluated using statistical measures such as the coefficient of determination (R²) and the root mean square error (RMSE). The results indicated that the MLR model and three of the ML models, namely LR, LAR, and BR, performed well in predicting the WQI. The MLR model had an R² of 0.999 and an RMSE of 0.149, while the three ML models had an R² of 1.0 and an RMSE of approximately 0.0. These results support using both MLR and ML models to predict the WQI with very high accuracy, which will contribute to improving water quality management.
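
The two evaluation measures named in the abstract, the coefficient of determination and the RMSE, can be illustrated with a minimal sketch. The function names and the WQI values below are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted WQI values (illustration only).
observed = [40.0, 55.0, 62.0, 71.0, 90.0]
predicted = [41.0, 54.0, 63.0, 70.0, 90.0]

print("RMSE:", rmse(observed, predicted))
print("R2:", r2_score(observed, predicted))
```

An RMSE near 0 and a coefficient of determination near 1, as reported for the LR, LAR, and BR models, mean the predictions almost coincide with the observed index values; a perfect predictor gives exactly 0 and 1 under these definitions.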
