Article

Evaluation of CatBoost method for prediction of reference evapotranspiration in humid regions

Journal

JOURNAL OF HYDROLOGY
Volume 574, Pages 1029-1041

Publisher

ELSEVIER
DOI: 10.1016/j.jhydrol.2019.04.085

Keywords

Random forests; Support vector machine; Gradient boosting decision tree; Computational complexity; Model comparison; CatBoost

Funding

  1. Foundation of Key Laboratory of Crop Water Use and Regulation, China [FIRI2018-01]
  2. National Natural Science Foundation of China [51709143, 61703199, 51790533]
  3. Natural Science Foundation of Jiangxi Province, China [20181BAB206045]

Accurate estimation of reference evapotranspiration (ET0) is critical for water resource management and irrigation scheduling. This study evaluated the potential of a new machine learning algorithm, gradient boosting on decision trees with categorical feature support (CatBoost), for estimating daily ET0 from limited meteorological data in humid regions of China. Two other commonly used machine learning algorithms, Random Forests (RF) and Support Vector Machine (SVM), were assessed for comparison. Eight input combinations of daily meteorological data, both complete and incomplete combinations of solar radiation (Rs), maximum and minimum temperatures (Tmax and Tmin), relative humidity (Hr) and wind speed (U), from five weather stations in South China during 2001-2015 were used for model training and testing. The results showed that, when complete meteorological data were unavailable, all three algorithms achieved satisfactory accuracy for ET0 estimation in subtropical China using either Rs, Tmax and Tmin, or U, Hr, Tmax and Tmin as inputs. The increases in RMSE and MAPE from training to testing were positively correlated with the number of input parameters supplied to the models. For the local models, SVM offered the best prediction accuracy and stability with incomplete combinations of meteorological parameters as inputs, while CatBoost performed best with the complete combination. The generalized models showed almost the same patterns as the local models, but with decreases of less than 10% in RMSE or MAPE relative to the local models. In addition, CatBoost required much less computing time and memory for data processing than RF and SVM. Overall, as a tree-based algorithm, CatBoost made significant improvements in accuracy, stability and computational cost when compared to RF. Therefore, the CatBoost algorithm has very high potential for ET0 estimation in humid regions of China, and possibly in other parts of the world with similar humid climates.
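The abstract compares the models by RMSE and MAPE and by how much these metrics grow from training to testing. The following is a minimal sketch of those quantities, assuming the standard textbook definitions; the abstract does not state the exact formulas the authors used. The function names and the ET0 values are hypothetical and serve only as an illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (e.g. mm/day)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error in percent; observed values must be non-zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def relative_increase(train_value, test_value):
    """Percentage increase of a testing metric over the training metric."""
    return 100.0 * (test_value - train_value) / train_value


# Hypothetical daily ET0 values in mm/day, made up for illustration only.
obs_train = [2.0, 4.0, 5.0, 4.0]
pred_train = [2.2, 3.8, 5.0, 4.0]
obs_test = [2.0, 4.0, 5.0, 4.0]
pred_test = [2.4, 3.6, 5.0, 4.0]

rmse_train = rmse(obs_train, pred_train)
rmse_test = rmse(obs_test, pred_test)
mape_train = mape(obs_train, pred_train)
mape_test = mape(obs_test, pred_test)

print(f"RMSE train={rmse_train:.3f}, test={rmse_test:.3f}")
print(f"MAPE train={mape_train:.2f}%, test={mape_test:.2f}%")
print(f"RMSE increase: {relative_increase(rmse_train, rmse_test):.1f}%")
```

A larger increase from training to testing suggests weaker generalization, which is the sense in which the abstract relates these increases to the number of input parameters.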
