Article

Variable selection for estimating individual tree height using genetic algorithm and random forest

Journal

FOREST ECOLOGY AND MANAGEMENT
Volume 504

Publisher

ELSEVIER
DOI: 10.1016/j.foreco.2021.119828

Keywords

Machine Learning; Optimization; Feature Selection; Forest Modelling; Mixed-Effect Model

Funding

  1. Brazilian power company - CEMIG
  2. Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior - Brasil (CAPES) [001]
  3. University of Lavras - UFLA

A hybrid method that combines genetic algorithms for variable selection with random forest for fitting models of individual tree height was proposed and compared with other methods using a dataset of 5,608 trees and 189 environmental variables. The optimal set of variables included the ratio of diameter at breast height to quadratic mean diameter, a competition index, dominant height, and soil silt and boron content. The proposed hybrid method estimated tree heights with accuracy comparable to the other methods. The study suggests that this modelling approach could have broader applications in forestry and ecological science.
Tree height is an important trait in forest science and is highly associated with the quality of the site where the trees are measured. Other factors, such as competition and species interaction, may yield better estimates of individual tree height when taken into account, but these variables have so far been challenging to include in model fitting. We propose a hybrid approach that uses genetic algorithms for variable selection and a machine learning algorithm (random forest) for fitting models of individual tree height. We compare the proposed hybrid method with a mixed-effects model and a random forest model using a dataset of 5,608 trees and 189 environmental variables (forest inventory-based, soil, topographic, climate, spectral, and geographic) from sites in southeastern Brazil. The tree height models were evaluated using the coefficient of determination, absolute bias, and root mean square error (RMSE), computed on the validation dataset. The optimal set of variables selected by the proposed method includes the ratio of diameter at breast height to quadratic mean diameter, a distance-independent competition index, dominant height, and soil silt and boron content. Our findings showed that the proposed hybrid method achieved accuracy comparable to the other methodologies in estimating the total height of individual trees. Such a modelling approach could have broader applications in forestry and ecological science wherever a studied response trait has a large number of potential explanatory variables.
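The abstract does not give the genetic algorithm settings, so the following is only a minimal sketch of how a genetic algorithm can wrap a model-fitting step to select variables. Every name and parameter here (`select_variables`, `pop_size`, `mutation_rate`, truncation selection, single-point crossover) is an assumption made for illustration, not the authors' configuration. The `fitness` callable is a placeholder: in a setting like the one described, it would fit a random forest on the selected columns and return the validation RMSE; here a toy function stands in so the sketch runs with the standard library alone.

```python
import random
from typing import Callable, Sequence, Tuple

Mask = Tuple[bool, ...]  # True at position i means variable i is selected


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5


def select_variables(
    n_vars: int,
    fitness: Callable[[Mask], float],  # lower is better, e.g. validation RMSE
    pop_size: int = 30,                # assumed value, not from the paper
    generations: int = 40,             # assumed value, not from the paper
    mutation_rate: float = 0.02,       # assumed value, not from the paper
    seed: int = 0,
) -> Mask:
    """Search for a variable subset that minimises `fitness`."""
    rng = random.Random(seed)
    population = [
        tuple(rng.random() < 0.5 for _ in range(n_vars)) for _ in range(pop_size)
    ]
    for _ in range(generations):
        scored = sorted(population, key=fitness)
        parents = scored[: max(2, pop_size // 2)]  # truncation selection
        children = [scored[0]]                     # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)         # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit != (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = children
    return min(population, key=fitness)


# Toy stand-in for "fit a random forest and return validation RMSE":
# the score is the number of positions that differ from a hidden target subset.
TARGET = tuple(i in {0, 3, 7, 11, 15} for i in range(20))


def toy_fitness(mask: Mask) -> float:
    return float(sum(m != t for m, t in zip(mask, TARGET)))


best = select_variables(20, toy_fitness)
selected = [i for i, keep in enumerate(best) if keep]
```

Because the best mask of each generation is carried over unchanged, the best score never gets worse from one generation to the next. With a real random forest as the fitness, each evaluation is expensive, so caching scores per mask would be a natural refinement.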
