An evaluation of the statistical methods for testing the performance of crop models with observed data

AGRICULTURAL SYSTEMS (2014)

Journal

AGRICULTURAL SYSTEMS

Volume 127, Pages 81-89

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.agsy.2014.01.008

Keywords

Statistical evaluation; Test statistics; Deviation statistics; Autocorrelation; Heteroskedasticity; Crop simulation model

Abstract

Calibration and evaluation are two important steps prior to the application of a crop simulation model. The objective of this paper was to review common statistical methods that are used for crop model calibration and evaluation. A group of deviation statistics was reviewed, including root mean squared error (RMSE), normalized RMSE (nRMSE), mean absolute error (MAE), mean error (E), paired t-test, index of agreement (d), modified index of agreement (d1), revised index of agreement (d1'), modeling efficiency (EF) and revised modeling efficiency (EF1). A case study of the statistical evaluation was conducted for the DSSAT Cropping System Model (CSM) using 10 experimental datasets for maize, peanut, soybean, wheat and potato from Brazil, China, Ghana, and the USA. The results indicated that R² was not a good statistic for model evaluation because it is insensitive to the regression coefficients (alpha and beta) of the linear model y = alpha + beta x + epsilon. However, linear regression can be used for model evaluation (test H0: alpha = 0, beta = 1) if autocorrelation, normality and heteroskedasticity of the error term (epsilon) are tested or the proper data transformations are made. The results also illustrated that statistical evaluation of the total dataset across treatments might be insufficient. Hence, the evaluation of each treatment is necessary to reach the right conclusion, especially when evaluating soil water content under different planting date treatments and soil mineral N under different N treatments. Co-variability analysis among the dimensionless statistics (d, d1, d1', EF and EF1) showed that d and EF are inflated by the sum of squares-based deviations, i.e., larger deviations carry more weight in the statistic than smaller deviations because of the squared term. However, EF had a larger range and a clear physical meaning at EF = 0, making it superior to d.
Values of d = 0.75 were obtained from regression with all non-negative values of EF (EF >= 0), indicating that d >= 0.75 and EF >= 0 should be the minimum values for plant growth evaluation. Values of d >= 0.60 and EF >= -1.0, combined with a t-test, should be the minimum values for the evaluation of soil outputs, because the soil parameters in the DSSAT SOIL module are more difficult to calibrate than plant growth parameters owing to insufficient observed soil data. Because of their statistical nature, no single statistic is more robust than the others, but some statistics are highly correlated. Therefore, several statistics may be used from each of the following correlated groups (RMSE, MAE), (E, t-test), (d, d1, d1') and (EF, EF1) in one assessment of model evaluation so that a representative statistical conclusion can be obtained with respect to model performance. © 2014 Elsevier Ltd. All rights reserved.
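
The deviation statistics named in the abstract can be illustrated with a short sketch. It assumes the commonly cited textbook definitions (Willmott's d and d1, Nash-Sutcliffe style EF, and the absolute-value form of EF1), which may differ in detail from the formulas used in the paper; the function name `deviation_statistics` and the sample data are hypothetical, and d1' is omitted because its exact form is not given in the abstract.

```python
import math


def deviation_statistics(observed, simulated):
    """Return common deviation statistics for paired observed/simulated values.

    Assumes standard textbook definitions, a non-zero observed mean,
    and observations that are not all identical.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equal in length")

    n = len(observed)
    obs_mean = sum(observed) / n

    # Deviations of simulated from observed values (simulated minus observed).
    errors = [s - o for o, s in zip(observed, simulated)]
    sum_sq_err = sum(e * e for e in errors)
    sum_abs_err = sum(abs(e) for e in errors)

    # "Potential error" terms used in the denominators of d and d1.
    potential = [abs(s - obs_mean) + abs(o - obs_mean)
                 for o, s in zip(observed, simulated)]

    # Spread of the observations around their own mean (denominators of EF, EF1).
    obs_dev_sq = sum((o - obs_mean) ** 2 for o in observed)
    obs_dev_abs = sum(abs(o - obs_mean) for o in observed)

    rmse = math.sqrt(sum_sq_err / n)
    return {
        "RMSE": rmse,
        "nRMSE": 100.0 * rmse / obs_mean,  # percent of the observed mean
        "MAE": sum_abs_err / n,
        "E": sum(errors) / n,              # mean error (bias)
        "d": 1.0 - sum_sq_err / sum(p * p for p in potential),
        "d1": 1.0 - sum_abs_err / sum(potential),
        "EF": 1.0 - sum_sq_err / obs_dev_sq,
        "EF1": 1.0 - sum_abs_err / obs_dev_abs,
    }


# Hypothetical data: every simulated value overestimates the observation by 0.5.
stats = deviation_statistics([1, 2, 3, 4, 5], [1.5, 2.5, 3.5, 4.5, 5.5])
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, a simulation that always returns the observed mean gives EF = 0, which is the physical meaning of EF = 0 referred to in the abstract: the model predicts no better than the mean of the observations.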
