Article

Visualizing Variable Importance and Variable Interaction Effects in Machine Learning Models

Journal

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
Volume 31, Issue 3, Pages 766-778

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/10618600.2021.2007935

Keywords

Black-box; Model explanation; Model visualization

Funding

  1. Science Foundation Ireland Career Development Award [17/CDA/4695]
  2. Marine Research Programme - Irish Government
  3. European Regional Development Fund [PBA/CC/18/01]
  4. European Union's Horizon 2020 research and innovation programme InnoVar [818144]
  5. SFI Centre for Research Training in Foundations of Data Science [18CRT/6049]
  6. SFI Research Centre [16/RC/3872, 12/RC/2289_P2]
  7. [16/IA/4520]
  8. H2020 Societal Challenges Programme [818144]

Summary

Variable importance, interaction measures, and partial dependence plots are important summaries in the interpretation of statistical and machine learning models. The new visualization techniques described in this article provide enhanced interpretation even in situations where the number of variables is large, and are applicable to regression and classification supervised learning settings. The visualizations are model-agnostic and carefully designed to highlight important aspects of the fit.

Abstract

Variable importance, interaction measures, and partial dependence plots are important summaries in the interpretation of statistical and machine learning models. In this article, we describe new visualization techniques for exploring these model summaries. We construct heatmap and graph-based displays showing variable importance and interaction jointly, which are carefully designed to highlight important aspects of the fit. We describe a new matrix-type layout showing all single and bivariate partial dependence plots, and an alternative layout based on graph Eulerians focusing on key subsets. Our new visualizations are model-agnostic and are applicable to regression and classification supervised learning settings. They enhance interpretation even in situations where the number of variables is large. Our R package vivid (variable importance and variable interaction displays) provides an implementation. Supplementary files for this article are available online.
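
For readers unfamiliar with partial dependence, the quantity behind the plots mentioned in the abstract can be sketched in a few lines. The Python below is only an illustration under stated assumptions: it is not the vivid package (which is written in R) and not the authors' code, and the names `partial_dependence`, `toy_model`, `data`, and `grid` are hypothetical. It treats the model as a black-box prediction function, which is what makes the approach model-agnostic.

```python
from statistics import mean
from typing import Callable, List, Sequence

Row = List[float]


def partial_dependence(
    predict: Callable[[Row], float],
    data: Sequence[Row],
    feature: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction when column `feature` is forced to each grid value.

    Only the prediction function is needed, so any fitted model can be used.
    """
    curve = []
    for value in grid:
        preds = []
        for row in data:
            modified = list(row)       # copy so the original data is untouched
            modified[feature] = value  # overwrite the feature of interest
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve


def toy_model(row: Row) -> float:
    # Hypothetical black box: an additive effect of feature 0 plus an
    # interaction between features 1 and 2.
    return 2.0 * row[0] + row[1] * row[2]


# Small made-up data set, used only to exercise the function.
data = [
    [0.0, 1.0, 2.0],
    [1.0, 2.0, 0.0],
    [2.0, 0.0, 1.0],
    [3.0, 1.0, 1.0],
]
grid = [0.0, 1.0, 2.0]

pd_x0 = partial_dependence(toy_model, data, 0, grid)
print(pd_x0)
```

Plotting `pd_x0` against `grid` would give a single-variable partial dependence curve. Forcing two features at once over a two-dimensional grid would give the bivariate analogue.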
