Random forests for global sensitivity analysis: A selective review

Journal

RELIABILITY ENGINEERING & SYSTEM SAFETY
Volume 206

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ress.2020.107312

Keywords

Random forests; Global sensitivity analysis

Understanding physical and engineering problems often requires running complex computational models with a high number of input variables. Global sensitivity analysis (GSA) methods help identify influential inputs, with meta-modeling and random forests providing efficient non-parametric approaches to sensitivity analysis. This allows for dimension reduction and effective quantification of variable importance, even in high-dimensional data sets.
Understanding many physical and engineering problems involves running complex computational models. Such models take a large number of numerical and physical explanatory variables as input, and the information on these underlying input parameters is often limited or uncertain. It is therefore important to identify and prioritize the most influential inputs, based on the relationships between the input variables and the output. One may use global sensitivity analysis (GSA) methods, which aim to rank input random variables according to their contribution to the output uncertainty, or even to quantify the global influence of a particular input on the output. Using sensitivity metrics to ignore less important parameters is a form of dimension reduction in the model's input parameter space. This suggests the use of meta-modeling as a quantitative approach for non-parametric GSA, where the original input/output relation is first approximated using statistical regression techniques. The main goal of our work is therefore to provide a comprehensive review of sensitivity analysis that focuses on the connections between random forests (RF) and GSA. The idea is to use the random forests methodology as an efficient non-parametric approach for building meta-models that allow an efficient sensitivity analysis. Apart from its easy applicability to regression problems, the random forests approach has further strong advantages: it implicitly deals with correlated and high-dimensional data, handles interactions between variables, and identifies informative inputs using a permutation-based RF variable importance index that is easy and fast to compute. We further review a set of tools for quantifying variable importance, which are then exploited to reduce the model's dimension, enabling otherwise infeasible sensitivity analysis studies.
Numerical results from several simulations and a data exploration on a real dataset are presented to illustrate the effectiveness of such an approach.
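
The workflow described in the abstract (fit a random forest meta-model, then rank inputs by a permutation-based importance index) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy model, the function name `rank_inputs`, and all parameter values are hypothetical, and it assumes scikit-learn's `permutation_importance` evaluated on a held-out split, which may differ from the out-of-bag variant usually associated with random forests.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def rank_inputs(n_samples=600, n_inputs=6, seed=0):
    """Fit a random forest meta-model and rank inputs by permutation importance."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))

    # Hypothetical toy model (an assumption for illustration only):
    # inputs 0 and 1 drive the output, the remaining inputs are inert.
    y = 5.0 * X[:, 0] + 4.0 * X[:, 1] ** 2 + 0.1 * rng.normal(size=n_samples)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    # The forest plays the role of the meta-model approximating the
    # input/output relation.
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train)

    # Permutation importance: shuffle one input column at a time and
    # measure how much the held-out score (R^2 by default) degrades.
    result = permutation_importance(
        forest, X_test, y_test, n_repeats=10, random_state=seed
    )

    # Indices of the inputs, most influential first.
    ranking = np.argsort(result.importances_mean)[::-1]
    return ranking, result.importances_mean


ranking, importances = rank_inputs()
for idx in ranking:
    print(f"input {idx}: mean importance {importances[idx]:.3f}")
```

Inputs whose mean importance is close to zero are candidates for removal, which is the dimension-reduction step the abstract refers to. The threshold for "close to zero" is a modeling choice that this sketch does not settle.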
