Article

Improving the performance of machine learning models for early warning of harmful algal blooms using an adaptive synthetic sampling method

WATER RESEARCH (2021)

Journal

WATER RESEARCH

Volume 207

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.watres.2021.117821

Keywords

Harmful algal blooms; Alert level; ADASYN; Machine learning; Early warning

Categories

Engineering, Environmental; Environmental Sciences; Water Resources

Funding

Basic Science Research Program through the National Research Foundation of Korea (NRF) - Ministry of Science, ICT & Future Planning [NRF-2019R1C1C1011366]
Korea Institute of Energy Technology Evaluation and Planning (KETEP) and Ministry of Trade, Industry and Energy (MOTIE) [20194010201900]
Funding source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

Smart summary

The study aimed to predict algal bloom alert levels using machine learning models and to address data imbalance by generating synthetic data. Models trained on a combination of original and synthetic data predicted alert levels better than models trained on original data alone, particularly for critical alert levels.

Abstract

Many countries have attempted to monitor and predict harmful algal blooms to mitigate related problems and establish management practices. The current alert system, which is based on sampling of cell density, is used to indicate the bloom status and to inform a rapid and adequate response from water-associated organizations. The objective of this study was to develop an early warning system for cyanobacterial blooms to allow for efficient decision making prior to the occurrence of algal blooms and to guide preemptive actions regarding management practices. In this study, two machine learning models, an artificial neural network (ANN) and a support vector machine (SVM), were constructed for the timely prediction of algal bloom alert levels using eight years' worth of meteorological, hydrodynamic, and water quality data from a reservoir where harmful cyanobacterial blooms frequently occur during summer. However, the imbalanced proportions of the alert levels in the output variable lead to biased training of the data-driven models and degrade their prediction performance. Therefore, synthetic data generated by an adaptive synthetic (ADASYN) sampling method were used to resolve the imbalance of minority class data in the original data and to improve the prediction performance of the models. The results showed that the overall prediction performance for the caution level (L1) and warning level (L2) was higher in the models constructed using a combination of original and synthetic data than in the models constructed using original data only. In particular, the optimal ANN and SVM constructed using a combination of original and synthetic data yielded markedly improved recall and precision for L1 during both training (including validation) and testing. L1 is a very critical alert level because it indicates a transition from normalcy to bloom formation. In addition, both optimal models constructed using synthetic-added data improved recall and precision by more than 33.7% when predicting L1 and L2 during testing. Therefore, the application of synthetic data can improve the detection performance of machine learning models by resolving the imbalance of observed data. Reliable prediction by the improved models can be used to aid the design of management practices to mitigate algal blooms within a reservoir.
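
The abstract names ADASYN but does not spell out how it produces synthetic samples. The following is a minimal sketch of the general ADASYN idea for one minority class against all other classes, written in plain Python. It is not the authors' implementation. The function name `adasyn`, the parameters `k`, `beta`, and `seed`, the fallback to uniform weights, and the toy data with labels `"normal"` and `"L1"` are all assumptions made for illustration.

```python
import math
import random


def adasyn(X, y, minority, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples as a list of feature lists.

    X: list of numeric feature lists; y: list of class labels.
    minority: label to oversample; every other label counts as majority.
    beta: desired balance level in (0, 1]; 1.0 aims for full balance.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_maj = len(y) - n_min
    n_needed = int((n_maj - n_min) * beta)
    if n_needed <= 0 or n_min < 2:
        return []  # nothing to balance, or too few points to interpolate

    def nearest(i, pool):
        # Up to k nearest neighbours of sample i among the indices in pool.
        return sorted(pool, key=lambda j: math.dist(X[i], X[j]))[:k]

    # Step 1: difficulty ratio of each minority sample, i.e. the share of
    # majority points among its nearest neighbours in the whole data set.
    ratios = []
    for i in min_idx:
        nn = nearest(i, [j for j in range(len(X)) if j != i])
        ratios.append(sum(y[j] != minority for j in nn) / len(nn))

    # Step 2: normalise the ratios into weights that sum to 1.
    # Assumption: fall back to uniform weights if no sample is "hard".
    total = sum(ratios)
    if total == 0:
        weights = [1.0 / n_min] * n_min
    else:
        weights = [r / total for r in ratios]

    # Step 3: harder samples receive more synthetic points, each placed on
    # the segment between the sample and a random minority neighbour.
    synthetic = []
    for w, i in zip(weights, min_idx):
        mates = nearest(i, [j for j in min_idx if j != i])
        for _ in range(round(w * n_needed)):
            z = rng.choice(mates)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[z])])
    return synthetic


# Toy, made-up data: eight "normal" observations and three "L1" observations.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.2, 0.4], [0.4, 0.2], [0.1, 0.4], [0.4, 0.4],
     [1.0, 1.0], [0.5, 0.5], [1.2, 0.9]]
y = ["normal"] * 8 + ["L1"] * 3
new_points = adasyn(X, y, minority="L1", k=3)
print(len(new_points), "synthetic L1 samples")
```

In this sketch, minority samples surrounded by majority neighbours receive more synthetic points than well-separated ones, which is the adaptive part of the method. For a multi-class alert-level problem, one plausible approach would be to call the function once per minority level. Because per-sample counts are rounded, the number of generated points can differ slightly from the target.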
