Article

Spatial Multivariate Trees for Big Data Bayesian Regression

Journal

JOURNAL OF MACHINE LEARNING RESEARCH
Volume 23

Publisher

MICROTOME PUBL

Keywords

Directed acyclic graph; Gaussian process; Geostatistics; Multivariate regression; Markov chain Monte Carlo; Multiscale/multiresolution

Funding

  1. European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme [856506]
  2. United States National Institutes of Health (NIH) [R01ES028804]

This study proposes Bayesian multivariate regression models based on spatial multivariate trees (SPAMTREES), which achieve scalability through conditional independence assumptions on latent random effects. The models are illustrated on simulated data and on a large climate data set that combines satellite data with land-based station data.
High-resolution geospatial data are challenging because standard geostatistical models based on Gaussian processes do not scale to large data sizes. While progress has been made towards methods that can be computed more efficiently, considerably less attention has been devoted to methods for large-scale data that allow the description of complex relationships between several outcomes recorded at high resolutions by different sensors. Our Bayesian multivariate regression models based on spatial multivariate trees (SPAMTREES) achieve scalability via conditional independence assumptions on latent random effects following a treed directed acyclic graph. Information-theoretic arguments and considerations on computational efficiency guide the construction of the tree and the related efficient sampling algorithms in imbalanced multivariate settings. In addition to simulated data examples, we illustrate SPAMTREES using a large climate data set which combines satellite data with land-based station data. Software and source code are available on CRAN at https://CRAN.R-project.org/package=spamtree.
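
The scalability argument in the abstract rests on a general property of directed acyclic graphs: when each latent effect depends only on its parents, the joint density factorises into small conditional densities. The sketch below illustrates that idea in a deliberately simplified setting and is not the authors' implementation. It assumes a univariate zero-mean Gaussian process on one-dimensional locations, an exponential covariance, and at most one parent per node; the names `exp_cov`, `tree_logpdf`, `locs`, and `parent` are hypothetical and chosen only for illustration.

```python
import math


def exp_cov(s, t, sigma2=1.0, phi=1.0):
    """Exponential covariance between 1-D locations s and t (assumed kernel)."""
    return sigma2 * math.exp(-phi * abs(s - t))


def tree_logpdf(locs, parent, w, cov=exp_cov):
    """Log density of latent effects w under a tree-shaped DAG.

    parent[i] is the index of node i's single parent, or None for the root.
    The joint density is the product of univariate Gaussian conditionals
    N(w_i | w_parent(i)), so the cost grows linearly with the number of nodes.
    Locations are assumed distinct so every conditional variance is positive.
    """
    total = 0.0
    for i, s in enumerate(locs):
        p = parent[i]
        if p is None:
            # Root node: marginal distribution of a zero-mean process.
            mean, var = 0.0, cov(s, s)
        else:
            # Child node: Gaussian conditional given its parent only.
            c_sp = cov(s, locs[p])
            c_pp = cov(locs[p], locs[p])
            mean = c_sp / c_pp * w[p]
            var = cov(s, s) - c_sp ** 2 / c_pp
        total += -0.5 * (math.log(2 * math.pi * var) + (w[i] - mean) ** 2 / var)
    return total


# Hypothetical four-node tree: node 0 is the root, nodes 1 and 2 are its
# children, and node 3 is a child of node 1.
locs = [0.5, 0.2, 0.8, 0.1]
parent = [None, 0, 0, 1]
w = [0.3, -0.1, 0.4, 0.0]
logp = tree_logpdf(locs, parent, w)
```

No covariance matrix over all locations is ever formed or inverted here, which is the source of the computational saving. The models described in the abstract are multivariate and use a more elaborate tree construction, so this sketch conveys only the factorisation principle.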
