4.8 Article

Short-term Lake Erie algal bloom prediction by classification and regression models

Journal

WATER RESEARCH
Volume 232, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.watres.2023.119710

Keywords

Bloom forecast; Feature selection; Long-short term memory; Machine learning; Random forest; Time series modeling

Recent outbreaks of harmful algal blooms (HABs) in the western Lake Erie Basin have drawn significant attention to bloom prediction for better control and management. To address the limitations of existing models, the authors conducted a comprehensive literature review, compiled a large dataset, and built machine learning-based classification and regression models for 10-day scale bloom predictions. Feature importance analysis identified the 8 key features for HAB control. The 2-level classification model reached an accuracy of 89.6%, and feeding LSTM forecasts of the short-term features into it still allowed short-term bloom forecasts when measured feature values were not available.
The recent outbreaks of harmful algal blooms (HABs) in the western Lake Erie Basin (WLEB) have drawn tremendous attention to bloom prediction for better control and management. Many weekly to annual bloom prediction models have been reported, but they employ only small datasets, use limited types of input features, rely on linear regression or probabilistic models, or require complex process-based computations. To address these limitations, we conducted a comprehensive literature review, compiled a large dataset containing the chlorophyll-a index (from 2002 to 2019) as the output and a novel combination of riverine (the Maumee and Detroit Rivers) and meteorological (WLEB) features as the input, and built machine learning-based classification and regression models for 10-d scale bloom predictions. By analyzing the feature importance, we identified the 8 most important features for HAB control, including nitrogen loads, time, water levels, soluble reactive phosphorus load, and solar irradiance. Here, both long- and short-term nitrogen loads were considered for the first time in HAB models for Lake Erie. Based on these features, the 2-, 3-, and 4-level random forest classification models achieved accuracies of 89.6%, 77.0%, and 66.7%, respectively, and the regression model achieved an R² value of 0.69. In addition, long short-term memory (LSTM) was implemented to predict temporal trends of four short-term features (N, solar irradiance, and two water levels) and achieved Nash-Sutcliffe efficiencies of 0.12-0.97. Feeding the LSTM model predictions for these features into the 2-level classification model reached an accuracy of 86.0% for predicting the HABs in 2017-2018, suggesting that we can provide short-term HAB forecasts even when the feature values are not available.
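The Nash-Sutcliffe efficiency (NSE) quoted for the LSTM feature forecasts compares a model's squared errors against those of a baseline that always predicts the observed mean: a value of 1 indicates a perfect match, and 0 indicates no improvement over the mean. The following is a minimal sketch of the standard formula, not the authors' code; the function name and the sample values are illustrative assumptions and are not taken from the study.

```python
from statistics import fmean


def nash_sutcliffe_efficiency(observed, simulated):
    """Return 1 - SS_res / SS_tot for paired observed and simulated values."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = fmean(observed)
    # Squared error of the model against the observations.
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Squared error of the "always predict the observed mean" baseline.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("NSE is undefined when the observations have zero variance")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (hypothetical, not data from the paper).
observed = [2.0, 4.0, 6.0, 8.0]
print(nash_sutcliffe_efficiency(observed, [2.0, 4.0, 6.0, 8.0]))  # 1.0
print(nash_sutcliffe_efficiency(observed, [5.0, 5.0, 5.0, 5.0]))  # 0.0
```

On this scale, the lower end of the reported 0.12-0.97 range corresponds to a feature forecast only slightly better than the mean baseline, while the upper end is close to a perfect match.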
