Article

Assessing Machine Learning Models for Gap Filling Daily Rainfall Series in a Semiarid Region of Spain

Journal

ATMOSPHERE
Volume 12, Issue 9

Publisher

MDPI
DOI: 10.3390/atmos12091158

Keywords

gap-filling; rainfall series; machine learning; Bayesian optimization

Funding

  1. Spanish Ministry of Science, Innovation and Universities [AGL2017-87658-R]
  2. University of Cordoba: PIF scholarship

This study assessed several machine learning models for gap-filling daily rainfall series and found that using neighbor data from within a 50 km radius outperformed the other approaches. Results were also generally better in inland areas than in coastal areas, indicating that distance to the sea has a notable effect on model efficiency.
Missing data are a common problem in hydrometeorological datasets, usually caused by sensor malfunction, deficiencies in record storage and transmission, or issues in other data recovery procedures. These missing values are a primary source of problems when analyzing and modeling the spatial and temporal variability of such data. Accurate gap-filling techniques for rainfall time series are therefore necessary to obtain complete datasets, which are crucial for studying the evolution of climate change. In this work, several machine learning models were assessed for gap-filling rainfall data, using different approaches and locations in the semiarid region of Andalusia (southern Spain). Based on the results obtained, the use of neighbor data from within a 50 km radius clearly outperformed the rest of the assessed approaches, with RMSE (root mean squared error) values up to 1.246 mm/day, MBE (mean bias error) values up to -0.001 mm/day, and R² values up to 0.898. In addition, results in inland areas outperformed those in coastal areas at most locations, revealing an effect of the distance to the sea on model efficiency (an improvement of up to 63.89% in terms of RMSE). Finally, machine learning (ML) models, especially the multilayer perceptron (MLP), notably outperformed simple linear regression estimates at the coastal sites, whereas at inland locations the improvements were not as significant.
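
The three scores quoted in the abstract (RMSE, MBE, and R²) can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `gap_fill_metrics`, the sign convention for MBE (estimated minus observed), the use of the coefficient of determination for R², and the sample values are all assumptions made for illustration; the paper may define R² differently, for example as the squared correlation coefficient.

```python
import math


def gap_fill_metrics(observed, estimated):
    """Return RMSE, MBE and R^2 for paired daily rainfall values (mm/day).

    Assumptions (not taken from the paper): MBE is estimated minus
    observed, and R^2 is the coefficient of determination.
    """
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [est - obs for obs, est in zip(observed, estimated)]

    # Root mean squared error: penalizes large misses more heavily.
    ss_res = sum(err * err for err in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean bias error: positive means overestimation on average.
    mbe = sum(errors) / n

    # Coefficient of determination; undefined if observations are constant.
    mean_obs = sum(observed) / n
    ss_tot = sum((obs - mean_obs) ** 2 for obs in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"rmse": rmse, "mbe": mbe, "r2": r2}


# Hypothetical values for illustration only, not data from the study.
observed = [0.0, 2.0, 4.0, 6.0]
estimated = [1.0, 2.0, 3.0, 6.0]
metrics = gap_fill_metrics(observed, estimated)
```

In this made-up example the over- and underestimates cancel, so MBE is zero even though RMSE is not, which is why the two scores are usually reported together.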
