4.6 Article

Applying additive modelling and gradient boosting to assess the effects of watershed and reach characteristics on riverine assemblages

Journal

METHODS IN ECOLOGY AND EVOLUTION
Volume 3, Issue 1, Pages 116-128

Publisher

WILEY
DOI: 10.1111/j.2041-210X.2011.00124.x

Keywords

benthic macroinvertebrates; diversity; fish; generalized additive models; richness; spatial autocorrelation; streams

Funding

  1. Smithsonian Institution
  2. US Environmental Protection Agency National Center for Environmental Research (NCER) [R831369]
  3. Interdisciplinary Center for Clinical Research (IZKF) at the University Hospital of the University of Erlangen-Nuremberg [J11]
  4. EPA [908959, R831369] Funding Source: Federal RePORTER

1. Issues with ecological data (e.g. non-normality of errors, nonlinear relationships and autocorrelation of variables) and modelling (e.g. overfitting, variable selection and prediction) complicate regression analyses in ecology. Flexible models, such as generalized additive models (GAMs), can address data issues, and machine learning techniques (e.g. gradient boosting) can help resolve modelling issues. Gradient boosted GAMs do both. Here, we illustrate the advantages of this technique using data on benthic macroinvertebrates and fish from 1573 small streams in Maryland, USA.

2. We assembled a predictor matrix of 15 watershed attributes (e.g. ecoregion and land use), 15 stream attributes (e.g. width and habitat quality) and location (latitude and longitude). We built boosted and conventionally estimated GAMs for macroinvertebrate richness and for the relative abundances of macroinvertebrates in the Orders Ephemeroptera, Plecoptera and Trichoptera (% EPT); individuals that cling to substrate (% Clingers); and individuals in the collector/gatherer functional feeding group (% Collectors). For fish, models were constructed for taxonomic richness, benthic species richness, biomass and the relative abundance of tolerant individuals (% Tolerant Fish).

3. For several of the responses, boosted GAMs had lower pseudo R²s than conventional GAMs for in-sample data but larger pseudo R²s for out-of-bootstrap data, suggesting boosted GAMs do not overfit the data and have higher prediction accuracy than conventional GAMs. The models explained most variation in fish richness (pseudo R² = 0.97), least variation in % Clingers (pseudo R² = 0.28) and intermediate amounts of variation in the other responses (pseudo R²s between 0.41 and 0.60). Many relationships of macroinvertebrate responses to anthropogenic measures and natural watershed attributes were nonlinear. Fish responses were related to system size and local habitat quality.

4. For impervious surface, models predicted below model-average macroinvertebrate richness at levels above c. 3.0%, lower % EPT above c. 1.5%, and lower % Clingers for levels above c. 2.0%. Impervious surface did not affect % Collectors or any fish response. Prediction functions for % EPT and fish richness increased linearly with log10(watershed area), % Tolerant Fish decreased with log10(watershed area), and benthic fish richness and biomass both increased nonlinearly with log10(watershed area).

5. Gradient boosting optimizes the predictive accuracy of GAMs while preserving the structure of conventional GAMs, so that predictor-response relationships are more interpretable than with other machine learning methods. Boosting also avoids overfitting the data (by shrinking effect estimates towards zero and by performing variable selection), thus avoiding spurious predictor effects.
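
The shrinkage and variable selection mentioned in point 5 can be illustrated with a minimal sketch of component-wise gradient boosting. This is not the authors' implementation. It assumes squared-error loss and simple linear base learners in place of the smooth base learners a boosted GAM would use. The function name `componentwise_boost`, its parameters and the synthetic data are all hypothetical.

```python
import numpy as np

def componentwise_boost(X, y, n_steps=30, nu=0.1):
    """Component-wise L2 boosting with linear base learners (illustrative only).

    Each step fits every predictor separately to the current residuals,
    keeps only the best-fitting one, and adds a fraction `nu` of its fit.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)              # centre predictors
    col_ss = (Xc ** 2).sum(axis=0)       # per-predictor sum of squares
    intercept = y.mean()
    coef = np.zeros(X.shape[1])
    resid = y - intercept
    for _ in range(n_steps):
        # Least-squares slope of the residuals on each predictor alone.
        slopes = Xc.T @ resid / col_ss
        # Reduction in residual sum of squares each predictor would give.
        gain = slopes ** 2 * col_ss
        j = int(np.argmax(gain))         # variable selection: best predictor only
        coef[j] += nu * slopes[j]        # shrinkage: take a small step
        resid -= nu * slopes[j] * Xc[:, j]
    return intercept, coef

# Synthetic data (assumption): only predictors 0 and 3 affect the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.5, size=300)

intercept, coef = componentwise_boost(X, y, n_steps=30, nu=0.1)
print(np.round(coef, 2))
```

With early stopping (a small `n_steps`), one would expect the coefficients of the two informative predictors to stay below their least-squares values and the uninformative predictors to remain at or near zero. In practice the number of steps would be tuned on held-out or out-of-bootstrap data rather than fixed in advance.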
