Article

Facing spatial massive data in science and society: Variable selection for spatial models

Journal

SPATIAL STATISTICS
Volume 50

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.spasta.2022.100627

Keywords

LASSO; Variable selection; Spatial statistics

Funding

  1. Agencia Nacional de Investigacion e Innovacion (ANII)
  2. Instituto Franco Uruguayo de Matematica e Interacciones (IFUMI)
  3. French Embassy in Uruguay


This work addresses variable selection for spatial regression models on irregular lattices whose errors follow Conditional or Simultaneous Auto-Regressive (CAR or SAR) models. The strategy is to whiten the residuals by estimating their spatial covariance matrix and then to apply L1-penalized (LASSO) regression. The study proves sign consistency for general dependent errors and gives conditions on the weight matrix of the SAR or CAR model that ensure the method is valid.
This work focuses on variable selection for spatial regression models, with locations on irregular lattices and errors following Conditional or Simultaneous Auto-Regressive (CAR or SAR) models. The strategy is to whiten the residuals by estimating their spatial covariance matrix and then to apply the standard L1-penalized (LASSO) regression for independent data to the transformed model. A result is stated that proves sign consistency for general dependent errors, provided that the transformed design matrix fulfills the standard assumptions of the LASSO procedure and that the estimate of the residual covariance matrix is consistent. Sufficient conditions on the weight matrix of the SAR or CAR model are then given that ensure those conditions hold. A simulation study shows that this method gives good results in terms of variable selection, although some underestimation of the coefficients is noted. It is compared to a strategy that estimates both the regression and the covariance parameters in a Least Angle Regression (LARS) procedure. Coefficients are better estimated with the LARS procedure, but in some cases it produces many more false positives in the variable selection. The application is the regression of income data from rural areas of Uruguay on a set of covariates describing socio-economic characteristics of the households. (c) 2022 Elsevier B.V. All rights reserved.
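
The whiten-then-LASSO idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the errors follow a SAR model e = rho * W e + eta on a toy ring lattice (a regular lattice, chosen only for brevity), the autoregressive parameter `rho` is treated as an already available estimate rather than being estimated, and the LASSO is solved by a plain coordinate-descent routine (`lasso_cd`, a hypothetical helper) with a hand-picked penalty `lam`.

```python
import numpy as np

def sar_whitener(W, rho):
    """Return A = I - rho*W, which maps SAR errors e = rho*W e + eta back to eta."""
    return np.eye(W.shape[0]) - rho * W

def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]      # remove coordinate j from the fit
            z = X[:, j] @ resid / n
            beta[j] = soft_threshold(z, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]      # put the updated coordinate back
    return beta

rng = np.random.default_rng(0)
n, p = 400, 10

# Toy weight matrix (assumption): ring lattice, two neighbours, row-normalised.
W = np.zeros((n, n))
idx = np.arange(n)
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5

rho = 0.6                                   # assumed known / plug-in estimate
A = sar_whitener(W, rho)

# Simulate a sparse regression with SAR errors.
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
X = rng.standard_normal((n, p))
eta = rng.standard_normal(n)                # independent innovations
e = np.linalg.solve(A, eta)                 # spatially correlated errors
y = X @ beta_true + e

# Whiten both sides, then run the LASSO on the transformed model.
X_t, y_t = A @ X, A @ y
beta_hat = lasso_cd(X_t, y_t, lam=0.3)

print("true signs     :", np.sign(beta_true))
print("estimated signs:", np.sign(beta_hat))
print("estimates      :", np.round(beta_hat, 3))
```

Multiplying by A = I - rho * W turns the correlated errors back into the independent innovations, so the transformed model satisfies the usual independent-error setting of the LASSO. The L1 penalty shrinks the nonzero coefficients toward zero, which is in line with the underestimation mentioned in the abstract; in practice `rho` and `lam` would have to be estimated or tuned rather than fixed by hand.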
