Article

Application of feature selection and regression models for chlorophyll-a prediction in a shallow lake

Journal

ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH
Volume 25, Issue 20, Pages 19488-19498

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s11356-018-2147-3

Keywords

Feature selection; Random forest; Minimum redundancy and maximum relevance; Support vector machine

Funding

  1. Tianjin Municipal Education Commission research project [2017KJ125]
  2. National Natural Science Foundation of China [41372373]
  3. Innovation team training plan of the Tianjin Education Committee [TD12-5037]

As a representative index of algal blooms, the concentration of chlorophyll-a (Chl-a) is a key parameter of concern for environmental managers. The relationships between environmental variables and Chl-a are complex and difficult to establish. In this study, two machine learning methods, support vector machine for regression (SVR) and random forest (RF), were used to predict Chl-a concentration from multiple variables. To improve model accuracy and reduce the number of inputs, two feature selection methods, minimum redundancy and maximum relevance (mRMR) and RF, were integrated with the regression models. The results showed that the RF model had higher predictive ability than the SVR model. Furthermore, its lower computational time and the absence of any required prior data transformation indicated better applicability of the RF model. A comparison between the ensemble models mRMR-RF and RF-RF showed that RF-RF yielded better performance with fewer variables. Seven variables selected from the candidate predictors could account for most of the information, and their potential implications for Chl-a were discussed based on their level of importance. Overall, the RF-RF ensemble model can be considered a useful approach for identifying significant stressors and achieving satisfactory prediction of Chl-a concentration.
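
The mRMR selection step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes absolute Pearson correlation for the mutual information that mRMR is usually built on, and the variable names (`tp`, `tp_copy`, `temp`, `noise`) and the synthetic data are hypothetical, chosen only to show how a redundant predictor gets penalized.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)


def mrmr_select(features, target, k):
    """Greedy correlation-based mRMR (a simplification, see lead-in).

    features: dict mapping feature name -> list of values
    target:   list of target values (here, a stand-in for Chl-a)
    k:        number of features to keep
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        scores = {}
        for name in remaining:
            if selected:
                # Mean absolute correlation with the already chosen features.
                redundancy = sum(
                    abs(pearson(features[name], features[s])) for s in selected
                ) / len(selected)
            else:
                redundancy = 0.0
            scores[name] = relevance[name] - redundancy
        best = max(remaining, key=lambda name: scores[name])
        selected.append(best)
        remaining.remove(best)
    return selected


def make_demo_data(n=300, seed=0):
    """Synthetic data with hypothetical predictors; not from the paper."""
    rng = random.Random(seed)
    tp = [rng.gauss(0, 1) for _ in range(n)]        # stand-in nutrient variable
    tp_copy = [v + rng.gauss(0, 0.05) for v in tp]  # near-duplicate of tp
    temp = [rng.gauss(0, 1) for _ in range(n)]      # stand-in temperature variable
    noise = [rng.gauss(0, 1) for _ in range(n)]     # unrelated to the target
    chla = [a + 0.5 * b + rng.gauss(0, 0.3) for a, b in zip(tp, temp)]
    features = {"tp": tp, "tp_copy": tp_copy, "temp": temp, "noise": noise}
    return features, chla


if __name__ == "__main__":
    features, chla = make_demo_data()
    print(mrmr_select(features, chla, 2))
```

In this toy setup `tp_copy` is almost as relevant to the target as `tp`, but once one of them is chosen the other scores poorly because of its high redundancy, so the second pick goes to a weaker but independent predictor. The features kept this way would then be passed to a regression model such as RF or SVR.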
