Journal
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
Volume 27, Issue 4, Pages 910-922
Publisher
TAYLOR & FRANCIS INC
DOI: 10.1080/10618600.2018.1473780
Keywords
Data visualization; Exploratory data analysis; Heatmap; Multivariate data
Funding
- Air Force Office of Scientific Research [FA9550-14-1-0016]
- National Human Genome Research Institute [1U01HG007031-01]
- National Science Foundation [CDSE-MSS 1228246, DMS-1107000, DMS-1160319, CCF-0939370]
Abstract
The technological advancements of the modern era have enabled the collection of huge amounts of data in science and beyond. Extracting useful information from such massive datasets is an ongoing challenge as traditional data visualization tools typically do not scale well in high-dimensional settings. An existing visualization technique that is particularly well suited to visualizing large datasets is the heatmap. Although heatmaps are extremely popular in fields such as bioinformatics, they remain a severely underutilized visualization tool in modern data analysis. This article introduces superheat, a new R package that provides an extremely flexible and customizable platform for visualizing complex datasets. Superheat produces attractive and extendable heatmaps to which the user can add a response variable as a scatterplot, model results as boxplots, correlation information as barplots, and more. The goal of this article is two-fold: (1) to demonstrate the potential of the heatmap as a core visualization method for a range of data types, and (2) to highlight the customizability and ease of implementation of the superheat R package for creating beautiful and extendable heatmaps. The capabilities and fundamental applicability of the superheat package will be explored via three reproducible case studies, each based on publicly available data sources.