Article

Predicting high-frequency variation in stream solute concentrations with water quality sensors and machine learning

Journal

HYDROLOGICAL PROCESSES
Volume 35, Issue 1

Publisher

WILEY
DOI: 10.1002/hyp.14000

Keywords

biogeochemistry; machine learning; stream solutes; water quality

Funding

  1. Directorate for Biological Sciences [1256696, 1637685]
  2. Office of Integrative Activities [1101245]
  3. National Science Foundation, Directorate for Biological Sciences, Division of Environmental Biology [1637685]
  4. National Science Foundation, Directorate for Biological Sciences, Division of Environmental Biology [1256696]

The study used two machine learning algorithms to predict stream solute concentrations and found that the Random Forest algorithm performed slightly better than the Support Vector Machine algorithm. Predictions were most sensitive to the removal of fluorescent dissolved organic matter, pH, and specific conductance as predictor variables, and least sensitive to the removal of dissolved oxygen and turbidity.
Stream solute monitoring has produced many insights into ecosystem and Earth system functions. Although new sensors have provided novel information about the fine-scale temporal variation of some stream water solutes, we lack adequate sensor technology to gain the same insights for many other solutes. We used two machine learning algorithms - Support Vector Machine and Random Forest - to predict concentrations at 15-min resolution for 10 solutes, of which eight lack specific sensors. The algorithms were trained with data from intensive stream sensing and manual stream sampling (weekly) for four full years in a hydrologic reference stream within the Hubbard Brook Experimental Forest in New Hampshire, USA. The Random Forest algorithm was slightly better at predicting solute concentrations than the Support Vector Machine algorithm (Nash-Sutcliffe efficiencies ranged from 0.35 to 0.78 for Random Forest compared to 0.29 to 0.79 for Support Vector Machine). Solute predictions were most sensitive to the removal of fluorescent dissolved organic matter, pH and specific conductance as independent variables for both algorithms, and least sensitive to dissolved oxygen and turbidity. The predicted concentrations of calcium and monomeric aluminium were used to estimate catchment solute yield, which changed most dramatically for aluminium because it concentrates with stream discharge. These results show great promise for using a combined approach of stream sensing and intensive stream discrete sampling to build information about the high-frequency variation of solutes for which an appropriate sensor or proxy is not available.
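The abstract compares the two algorithms by Nash-Sutcliffe efficiency (NSE). As a minimal sketch of how that metric is commonly computed, the function below uses the standard definition: one minus the ratio of the squared prediction error to the variance of the observations around their mean. The function name `nash_sutcliffe` and the sample values are illustrative assumptions, not taken from the paper, and this is not the authors' code.

```python
from statistics import fmean


def nash_sutcliffe(observed, predicted):
    """Return the Nash-Sutcliffe efficiency of predictions against observations.

    NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
    A value of 1 means a perfect match; 0 means the predictions are no
    better than always predicting the mean of the observations.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("at least two observations are required")

    mean_obs = fmean(observed)
    # Squared error of the model predictions.
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared deviation of the observations from their own mean.
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / spread


# Hypothetical concentrations (arbitrary units), for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
perfect_score = nash_sutcliffe(obs, obs)          # predictions equal observations
mean_score = nash_sutcliffe(obs, [2.5] * 4)       # predictions equal the observed mean
```

Under this definition, the reported ranges (0.35 to 0.78 for Random Forest, 0.29 to 0.79 for Support Vector Machine) all indicate predictions that outperform the observed mean, with the margin varying by solute.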
