4.7 Article

Linking big models to big data: efficient ecosystem model calibration through Bayesian model emulation

Journal

BIOGEOSCIENCES
Volume 15, Issue 19, Pages 5801-5830

Publisher

COPERNICUS GESELLSCHAFT MBH
DOI: 10.5194/bg-15-5801-2018

Funding

  1. National Science Foundation (NSF) [1318164, 1241891]
  2. NASA Terrestrial Ecosystems
  3. NSF (ABI) [1062547, 1458021]
  4. Energy Biosciences Institute
  5. Amazon AWS education grant
  6. USDA Forest Service's Northern Research Station
  7. National Science Foundation [1114804]
  8. Northeastern States Research Cooperative
  9. DOE NICCR [DE-FC02-06ER64157]
  10. NSF (DIBBS) [1261582]
  11. Office of Advanced Cyberinfrastructure (OAC)
  12. Directorate for Computer and Information Science and Engineering [1261582] (funding source: National Science Foundation)

Data-model integration plays a critical role in assessing and improving our capacity to predict ecosystem dynamics. Similarly, the ability to attach quantitative statements of uncertainty around model forecasts is crucial for model assessment and interpretation and for setting field research priorities. Bayesian methods provide a rigorous data assimilation framework for these applications, especially for problems with multiple data constraints. However, the Markov chain Monte Carlo (MCMC) techniques underlying most Bayesian calibration can be prohibitive for computationally demanding models and large datasets. We employ an alternative method, Bayesian model emulation of sufficient statistics, that can approximate the full joint posterior density, is more amenable to parallelization, and provides an estimate of parameter sensitivity. Analysis involved informative priors constructed from a meta-analysis of the primary literature and specification of both model and data uncertainties, and it introduced novel approaches to autocorrelation corrections on multiple data streams and emulating the sufficient statistics surface. We report the integration of this method within an ecological workflow management software, Predictive Ecosystem Analyzer (PEcAn), and its application and validation with two process-based terrestrial ecosystem models: SIPNET and ED2. In a test against a synthetic dataset, the emulator was able to retrieve the true parameter values. A comparison of the emulator approach to standard brute-force MCMC involving multiple data constraints showed that the emulator method was able to constrain the faster and simpler SIPNET model's parameters with comparable performance to the brute-force approach but reduced computation time by more than 2 orders of magnitude. The emulator was then applied to calibration of the ED2 model, whose complexity precludes standard (brute-force) Bayesian data assimilation techniques.
Both models are constrained after assimilation of the observational data with the emulator method, reducing the uncertainty around their predictions. Performance metrics showed increased agreement between model predictions and data. Our study furthers efforts toward reducing model uncertainties, showing that the emulator method makes it possible to efficiently calibrate complex models.
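
The emulator idea summarized in the abstract can be illustrated with a small sketch. It is not the PEcAn implementation and makes several assumptions not stated in the abstract: the "expensive" model is a hypothetical one-parameter toy (`run_model`), the observation error is Gaussian with a known standard deviation so that the sum of squared errors (SSE) serves as the sufficient statistic, the emulator is a Gaussian process with a fixed RBF kernel, and the prior is uniform. The model is run only at a small set of design points; the MCMC then samples the emulated SSE surface instead of calling the model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy stand-in for an expensive simulator (not SIPNET or ED2).
TIMES = np.linspace(0.0, 2.0, 20)

def run_model(theta):
    return np.exp(-theta * TIMES)

# Synthetic observations generated from a known "true" parameter.
TRUE_THETA, SIGMA = 0.7, 0.1
observed = run_model(TRUE_THETA) + rng.normal(0.0, SIGMA, TIMES.size)

def sse(theta):
    """Sufficient statistic under the assumed Gaussian error model."""
    return float(np.sum((run_model(theta) - observed) ** 2))

# Step 1: run the model at design points ("knots") spanning the prior range.
# These runs are independent of each other, so they could be parallelized.
PRIOR_LO, PRIOR_HI = 0.0, 2.0
knots = np.linspace(PRIOR_LO, PRIOR_HI, 21)
sse_at_knots = np.array([sse(k) for k in knots])

# Step 2: fit a Gaussian-process emulator of the SSE surface.
# Kernel form, length scale, and jitter are illustrative assumptions.
def rbf(a, b, length=0.3):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

mu, sd = sse_at_knots.mean(), sse_at_knots.std()
K = rbf(knots, knots) + 1e-8 * np.eye(knots.size)
weights = np.linalg.solve(K, (sse_at_knots - mu) / sd)

def emulated_sse(theta):
    """GP posterior mean of the SSE at a parameter value."""
    k_star = rbf(np.array([theta]), knots)
    return float(mu + sd * (k_star @ weights)[0])

def log_posterior(theta):
    if not (PRIOR_LO <= theta <= PRIOR_HI):
        return -np.inf  # outside the uniform prior
    return -emulated_sse(theta) / (2.0 * SIGMA ** 2)

# Step 3: Metropolis sampling on the emulator; no further model runs needed.
theta = 1.0
current = log_posterior(theta)
samples = []
for _ in range(6000):
    proposal = theta + rng.normal(0.0, 0.1)
    candidate = log_posterior(proposal)
    if np.log(rng.uniform()) < candidate - current:
        theta, current = proposal, candidate
    samples.append(theta)

posterior = np.array(samples[1000:])  # discard burn-in
posterior_mean = posterior.mean()
posterior_sd = posterior.std()
print(f"model runs: {knots.size}, posterior mean: {posterior_mean:.3f}, "
      f"posterior sd: {posterior_sd:.3f}")
```

In this sketch the model is evaluated 21 times in total, whereas a brute-force Metropolis chain of the same length would evaluate it 6000 times. That difference in the number of model runs is where the computational saving of an emulator comes from; the study itself works with multiple parameters, multiple data streams, and informative priors, none of which this toy reproduces.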
