4.6 Article

Causal network inference in a dam system and its implications on feature selection for machine learning forecasting

Publisher

ELSEVIER
DOI: 10.1016/j.physa.2022.127893

Keywords

Causality; Causal network; Dam system; Water supply; Machine learning

Funding

  1. Philippine Department of Science and Technology (DOST) [8419]

This study applies causal inference methods to derive the causal network of a dam system from time series data and shows that convergent cross mapping (CCM) gives the results most consistent with a real dam system. The results reveal lagged effects among water levels, climate and weather variables, and water demands, and offer an approach for selecting input variables in predictive models.
A fundamental goal across many research fields is to explain the possible mechanisms behind a phenomenon and to infer the correct causal relationships between variables. In this work, we employed various causal inference methods to derive the causal network of a dam system from time series data. We explored the lagged effects that the water levels in two dams, climate and weather variables, and domestic and agricultural water demands have on each other. Among the methods considered, we demonstrated that convergent cross mapping (CCM), a method for inferring causal relationships in complex systems from time series data, is the most consistent with an actual dam system: (1) the causal links were consistent with the direction of the physical flow of water, (2) the effects of climate and weather variables were successfully captured, and (3) the time lags shed light on the dynamics of the dam system and possibly reflected planning schedules that are not explicit in the data. Our results captured both intuitive and counter-intuitive causal links, some of which were validated by domain experts. Using the resulting causal links to pre-select the input variables of machine learning-based forecasting models significantly reduces the prediction errors compared to using randomly selected features. Specifically, the best reduction in mean absolute error (MAE) is 4.2-4.4 meters, which corresponds to an error 2.8-3.0 times lower than that obtained with randomly selected features. CCM was also able to filter the top 20 significant predictors; adding further variables yielded negligible improvement in the MAE. This is the first work that demonstrates successful inference of a time-lagged causal network of endogenous and exogenous variables in a dam system. © 2022 Published by Elsevier B.V.
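
To make the idea behind CCM concrete, the following is a minimal sketch of the generic cross-mapping step and is not the authors' implementation. It assumes a simple delay embedding and simplex-style nearest-neighbour weighting; the function names (`delay_embed`, `cross_map_skill`), the parameters `E`, `tau` and `lib_size`, and the toy coupled logistic maps are all illustrative assumptions and do not come from the paper or its dam data. The intuition: if `x` drives `y`, the delay-embedded history of `y` carries information about `x`, so `x` can be estimated from `y`'s shadow manifold.

```python
import numpy as np

def delay_embed(x, E, tau):
    """Rows are [x[t], x[t-tau], ..., x[t-(E-1)*tau]] for every valid t."""
    start = (E - 1) * tau
    return np.column_stack(
        [x[start - k * tau: len(x) - k * tau] for k in range(E)]
    )

def cross_map_skill(source, target, E=2, tau=1, lib_size=None):
    """Estimate `source` from the shadow manifold of `target`.

    Returns the Pearson correlation between the estimate and the actual
    source values. A high value is evidence that `source` influences `target`.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    M = delay_embed(target, E, tau)
    s = source[(E - 1) * tau:]          # align source with manifold rows
    if lib_size is not None:            # optionally restrict the library
        M, s = M[:lib_size], s[:lib_size]

    # Pairwise distances on the manifold; a point is never its own neighbour.
    d = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)

    # E + 1 nearest neighbours with exponentially decaying weights.
    idx = np.argsort(d, axis=1)[:, :E + 1]
    nd = np.take_along_axis(d, idx, axis=1)
    w = np.exp(-nd / np.maximum(nd[:, [0]], 1e-12))
    w /= w.sum(axis=1, keepdims=True)

    estimate = (w * s[idx]).sum(axis=1)
    return float(np.corrcoef(estimate, s)[0, 1])

# Toy system (an assumption, unrelated to the dam data): x drives y, not vice versa.
n = 400
x = np.empty(n)
y = np.empty(n)
x[0], y[0] = 0.4, 0.2
for t in range(n - 1):
    x[t + 1] = x[t] * (3.8 - 3.8 * x[t])
    y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - 0.1 * x[t])

skill_x_from_y = cross_map_skill(x, y)  # tests the link x -> y
skill_y_from_x = cross_map_skill(y, x)  # tests the link y -> x
print(skill_x_from_y, skill_y_from_x)
```

In a full CCM analysis the skill is also checked for convergence as `lib_size` grows, and, for a lagged network such as the one described above, the source series would be shifted by each candidate lag before cross mapping.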
