4.5 Article

Normalizing flows for conditional independence testing

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s10115-023-01964

Keywords

Conditional independence; Hypothesis testing; Representation learning; Generative models; Normalizing flows; Mixed data

This study introduces LCIT (Latent representation-based Conditional Independence Test), a method for testing conditional independence based on representation learning. LCIT first learns latent representations of the target variables X and Y that contain no information about the conditioning variable Z, and then tests the latent variables for any significant remaining dependence using a conventional correlation test. On a diverse collection of synthetic and real data sets, LCIT consistently outperforms several state-of-the-art baselines and adapts well to nonlinear, high-dimensional, and mixed data settings.
Detecting conditional independencies plays a key role in several statistical and machine learning tasks, especially in causal discovery algorithms, yet it remains a highly challenging problem because of the dimensionality of the data and the complex relationships present in it. In this study, we introduce LCIT (Latent representation-based Conditional Independence Test), a novel method for conditional independence testing based on representation learning. Our main contribution is a hypothesis testing framework in which, to test for independence between X and Y given Z, we first learn to infer latent representations of the target variables X and Y that contain no information about the conditioning variable Z. The latent variables are then examined for any significant remaining dependence, which can be done with a conventional correlation test. Moreover, LCIT can handle discrete data, and mixed-type data in general, by converting discrete variables into the continuous domain via variational dequantization. The empirical evaluations show that LCIT consistently outperforms several state-of-the-art baselines under different evaluation metrics, and that it adapts well to nonlinear, high-dimensional, and mixed data settings on a diverse collection of synthetic and real data sets.
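
The two-step structure described in the abstract (map X and Y to latents that carry no information about Z, then run a correlation test on the latents) can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: the learned conditional normalizing flow indicated by the title is swapped here for an affine map fitted by least squares, so the sketch only removes linear dependence on a one-dimensional Z and reduces to a partial-correlation test. The function names (`affine_latent`, `pearson`, `correlation_p_value`, `demo`) and the simulated data are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def affine_latent(target, cond):
    """Stand-in for a learned conditional flow (assumption, not LCIT itself).

    Computes eps = (t - a - b * z) / sigma with a, b fitted by least squares,
    so eps is linearly uncorrelated with the conditioning variable.
    """
    n = len(target)
    t_bar = sum(target) / n
    z_bar = sum(cond) / n
    szz = sum((z - z_bar) ** 2 for z in cond)
    szt = sum((z - z_bar) * (t - t_bar) for z, t in zip(cond, target))
    slope = szt / szz
    resid = [t - t_bar - slope * (z - z_bar) for t, z in zip(target, cond)]
    scale = math.sqrt(sum(r * r for r in resid) / n)
    return [r / scale for r in resid]


def pearson(u, v):
    """Sample Pearson correlation coefficient."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def correlation_p_value(u, v, n_cond=1):
    """Two-sided p-value for zero correlation via the Fisher z-transform.

    n_cond is the number of conditioning variables already regressed out.
    """
    r = max(min(pearson(u, v), 1 - 1e-12), -1 + 1e-12)
    stat = math.atanh(r) * math.sqrt(len(u) - n_cond - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(stat)))


def demo(n=2000, seed=0):
    """Simulate one conditionally independent and one dependent pair."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [2.0 * zi + rng.gauss(0, 1) for zi in z]
    # y_indep depends on Z only, so X and Y are independent given Z.
    y_indep = [-1.5 * zi + rng.gauss(0, 1) for zi in z]
    # y_dep also depends on X directly, so conditional independence fails.
    y_dep = [yi + 0.5 * xi for yi, xi in zip(y_indep, x)]

    eps_x = affine_latent(x, z)
    p_indep = correlation_p_value(eps_x, affine_latent(y_indep, z))
    p_dep = correlation_p_value(eps_x, affine_latent(y_dep, z))
    return p_indep, p_dep


if __name__ == "__main__":
    p_indep, p_dep = demo()
    print(f"p-value, X indep. of Y given Z: {p_indep:.3f}")
    print(f"p-value, X dep. on Y given Z:   {p_dep:.3g}")
```

In the dependent case the p-value should be very small, while in the independent case it behaves like a draw from a uniform distribution. A nonlinear conditional flow, as the paper proposes, would be needed to remove dependence on Z that a straight-line fit cannot capture.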