4.7 Article

Comparative study on total nitrogen prediction in wastewater treatment plant and effect of various feature selection methods on machine learning algorithms performance

Journal

Publisher

ELSEVIER
DOI: 10.1016/j.jwpe.2021.102033

Keywords

Machine learning; ANN; RF; GBM; Feature selection; Total nitrogen

This study evaluated the effect of seven feature selection methods on the accuracy of total nitrogen prediction in wastewater treatment plants. Scenario IV, suggested by Mutual Information, performed best. Gradient Boosting Machine also gave the best performance on the unseen test data set, indicating its effectiveness for predicting wastewater components.
Predicting wastewater characteristics in wastewater treatment plants (WWTPs) is valuable because it can reduce the amount of sampling, energy use, and cost. Feature Selection (FS) methods are applied during pre-processing to improve model performance. This study evaluates the effect of seven FS methods (filter, wrapper, and embedded methods) on the prediction accuracy for total nitrogen (TN) in the WWTP influent flow. Four scenarios based on the FS suggestions were defined and compared using three supervised Machine Learning (ML) algorithms: Artificial Neural Network (ANN), Random Forest (RF), and Gradient Boosting Machine (GBM). The input parameters were daily time series of pH, DO, COD, BOD, MLSS, MLVSS, NH4-N, and TN concentration. The data set was divided into a training set and an unseen test set, and the performance of all models was evaluated using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and correlation coefficient (R2). The results show that scenario IV, which was suggested by Mutual Information and includes NH4-N, COD, BOD, and DO, gave better results than the scenarios suggested by the other FS methods. Furthermore, the decision-tree-based algorithms (RF and GBM) performed better than the neural network algorithm (ANN). GBM generalized the patterns in the data set very well and produced the best performance on the unseen test set, which shows the effectiveness of this state-of-the-art ML algorithm for predicting wastewater components.
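The two computational ideas named in the abstract, ranking candidate inputs by mutual information with TN and scoring predictions with RMSE, MAE, and R2, can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the mutual information estimate uses a simple equal-width histogram (the study most likely used a library estimator), R2 is computed as the coefficient of determination because the abstract does not give its formula, and the function names and toy values are hypothetical.

```python
import math
from collections import Counter


def rmse(y_true, y_pred):
    """Root Mean Square Error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """R2 as the coefficient of determination (assumed definition)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def equal_width_bins(values, n_bins=4):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=4):
    """Histogram estimate of mutual information (in nats) between x and y."""
    bx, by = equal_width_bins(x, n_bins), equal_width_bins(y, n_bins)
    n = len(bx)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    # Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over observed bin pairs.
    return sum((c / n) * math.log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def rank_by_mutual_information(features, target, n_bins=4):
    """Return feature names sorted by estimated MI with the target, highest first."""
    scores = {name: mutual_information(col, target, n_bins) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy values, not data from the study.
toy_features = {
    "NH4-N": [20.0, 25.0, 30.0, 35.0, 40.0, 45.0],
    "pH": [7.1, 7.4, 7.0, 7.3, 7.2, 7.1],
}
toy_tn = [30.0, 36.0, 41.0, 47.0, 52.0, 58.0]
print(rank_by_mutual_information(toy_features, toy_tn))
```

In a filter-style workflow of this kind, the top-ranked inputs would be passed to each model, and the three metrics would be computed on the held-out test set rather than on the training data. Histogram estimates are sensitive to the number of bins, so the ranking from this sketch should be read as indicative only.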
