Article

Data splitting strategies for improving data driven models for reference evapotranspiration estimation among similar stations

Journal

COMPUTERS AND ELECTRONICS IN AGRICULTURE
Volume 162, Pages 70-81

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.compag.2019.03.030

Keywords

Data driven models; Evapotranspiration; Ancillary inputs; Gene expression programming

Funding

  1. University of Tabriz

In recent years, various heuristic data-driven models have been proposed to estimate reference evapotranspiration (ETo) with high accuracy as an alternative to empirical and physically based approaches. However, despite their complexity and soundness, these models rely on finite data series, as the empirical approaches do, and their practical validity depends heavily on the data management adopted in their development and assessment, in particular on how the data are split. A major issue in ensuring a sound assessment of heuristic model performance is the definition of a suitable criterion for splitting the data series into training and testing data. The present study evaluates new data set splitting strategies based on the adoption of ancillary external inputs to enhance the performance of Gene Expression Programming-based models for estimating ETo. All models are assessed using k-fold validation with annual test sizes. The results show that it is preferable to feed the new model with the external target variable as an input rather than with the original external input variables of the model. Regarding the external performance of the models, it is crucial to select a suitable training station for each testing station in order to obtain sufficiently accurate estimates. In this way, the applicability of such approaches is not limited to local emergency models; they also allow ETo to be estimated elsewhere without first training a local model on local targets. Finally, it is important to select carefully which station or stations will provide external ancillary ETo inputs to the training process, because otherwise these inputs introduce noise into the model and reduce its generalizability.
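The k-fold validation with annual test sizes mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each fold holds out one complete calendar year for testing and trains on all remaining years; the function name `annual_k_fold` and the toy `years` list are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def annual_k_fold(years):
    """Yield (test_year, train_idx, test_idx), holding out one year per fold.

    `years` lists the calendar year of each record, in record order.
    Assumption: one fold per distinct year, as a reading of "annual test sizes".
    """
    # Group record indices by calendar year.
    by_year = defaultdict(list)
    for i, year in enumerate(years):
        by_year[year].append(i)

    # Each distinct year serves exactly once as the test set.
    for test_year in sorted(by_year):
        test_idx = by_year[test_year]
        train_idx = [i for year, idx in by_year.items()
                     if year != test_year for i in idx]
        yield test_year, train_idx, test_idx


# Toy example: six records spread over three years (hypothetical data).
years = [2001, 2001, 2002, 2002, 2003, 2003]
folds = list(annual_k_fold(years))
```

With this scheme every record is used for testing exactly once and never appears in the training set of its own fold, so the test error is always computed on a year the model has not seen.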
