Article

Statistical testing under distributional shifts

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/jrsssb/qkad018

Keywords

distributional shifts; hypothesis testing; conditional independence testing; sampling importance resampling; causality

This study introduces statistical testing under distributional shifts. It addresses the problem of testing a hypothesis about a target distribution when data is observed from a different distribution. The proposed method resamples from the observed data and applies an existing test in the target domain, inheriting its asymptotic level and power.
We introduce statistical testing under distributional shifts. We are interested in the hypothesis P* ∈ H_0 for a target distribution P*, but observe data from a different distribution Q*. We assume that P* is related to Q* through a known shift τ and formally introduce hypothesis testing in this setting. We propose a general testing procedure that first resamples from the observed data to construct an auxiliary data set (similarly to sampling importance resampling) and then applies an existing test in the target domain. We prove that if the size of the resample is of order o(√n) and the resampling weights are well behaved, this procedure inherits the pointwise asymptotic level and power from the target test. If the map τ is estimated from data, we maintain the above guarantees under mild conditions on the estimation. Our results extend to finite sample level, uniform asymptotic level, a different resampling scheme, and statistical inference different from testing. Testing under distributional shifts allows us to tackle a diverse set of problems. We argue that it may prove useful in contextual bandit problems and covariate shift, show how it reduces conditional to unconditional independence testing, and provide example applications in causal inference.
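
The resample-then-test idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the shift is a known change of a Gaussian mean (so the weights are a density ratio), the resample is drawn with replacement as in classic sampling importance resampling, the resample size is n^0.45 (one choice that grows more slowly than √n), and the target test is a simple two-sided z-test for a mean. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def shift_weights(xs, q_mu, p_mu, sigma=1.0):
    """Importance weights p(x)/q(x) for an assumed known Gaussian mean shift."""
    p, q = NormalDist(p_mu, sigma), NormalDist(q_mu, sigma)
    return [p.pdf(x) / q.pdf(x) for x in xs]


def resample_size(n, exponent=0.45):
    """Resample size m = n**exponent; exponent < 0.5 keeps m of order o(sqrt(n))."""
    return max(2, int(n ** exponent))


def resample(data, weights, m, rng):
    """Draw m points with probability proportional to the weights (with replacement)."""
    return rng.choices(data, weights=weights, k=m)


def z_test_mean(sample, mu0):
    """Two-sided z-test p-value for H0: mean == mu0 (stand-in for the target test)."""
    se = stdev(sample) / math.sqrt(len(sample))
    z = (mean(sample) - mu0) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def shifted_test(data, weights, mu0, rng, exponent=0.45):
    """Build the auxiliary data set by resampling, then apply the target test to it."""
    m = resample_size(len(data), exponent)
    auxiliary = resample(data, weights, m, rng)
    return z_test_mean(auxiliary, mu0)


# Usage: data observed under Q* = N(0, 1); hypothesis about the target P* = N(1, 1).
rng = random.Random(0)
observed = [rng.gauss(0.0, 1.0) for _ in range(20000)]
weights = shift_weights(observed, q_mu=0.0, p_mu=1.0)
p_value = shifted_test(observed, weights, mu0=1.0, rng=rng)
print(f"resample size: {resample_size(len(observed))}, p-value: {p_value:.3f}")
```

The small resample is the point of the design: with m growing more slowly than √n, repeated draws of the same observation become rare, so the auxiliary data set behaves approximately like a sample from the target distribution and the existing test can be applied to it unchanged.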

