4.7 Article

Stable Prediction With Leveraging Seed Variable

Journal

IEEE Transactions on Knowledge and Data Engineering
Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2022.3169333

Keywords

Conditional independence; separation; seed variables; stable prediction

Abstract

In this paper, we focus on the problem of stable prediction across unknown test data, where the test distribution might differ from the training one and is unknown during model training. In such a case, previous machine learning methods might exploit subtle spurious correlations induced by non-causal variables in the training data for prediction. Those spurious correlations can vary across datasets, leading to instability of prediction across unknown test data. To address this problem, we propose an algorithm based on conditional independence tests that screens out non-causal features and reduces spurious correlations by leveraging a seed variable. We show, both theoretically and with empirical experiments, that our algorithm can precisely screen out the isolated non-causal variables, which have no causal relationship with other variables, and remove the spurious correlations induced by them, increasing the stability of prediction across unknown test data. Extensive experiments on both synthetic and real-world datasets demonstrate that our algorithm outperforms state-of-the-art methods for stable prediction across unknown test data.
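To make the screening idea concrete, here is a minimal, illustrative Python sketch, not the authors' algorithm: it uses a Fisher-z (partial) correlation test as the conditional independence test, and the rule of dropping features that test independent of the seed variable as candidate isolated non-causal variables is an assumption for illustration. All function and variable names are hypothetical.

```python
import math
import numpy as np

def fisher_z_ci_test(x, y, z=None, alpha=0.05):
    """Return True if x and y are judged (conditionally) independent
    given z, via a partial-correlation / Fisher-z test (illustrative
    stand-in for the paper's conditional independence tests)."""
    n = len(x)
    if z is None:
        r = np.corrcoef(x, y)[0, 1]
        k = 0
    else:
        z = np.atleast_2d(np.asarray(z, dtype=float))
        if z.shape[0] != n:           # accept (n,) or (n, k) input
            z = z.T
        design = np.column_stack([np.ones(n), z])
        # condition on z by correlating the regression residuals
        rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
        ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
        r = np.corrcoef(rx, ry)[0, 1]
        k = z.shape[1]
    r = float(np.clip(r, -0.999999, 0.999999))
    stat = math.sqrt(n - k - 3) * 0.5 * math.log((1 + r) / (1 - r))
    # two-sided p-value from the standard normal
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(stat) / math.sqrt(2.0))))
    return p > alpha

def screen_with_seed(X, seed, alpha=0.05):
    """Keep features dependent on the seed variable; drop the rest as
    candidate isolated non-causal variables (assumed keep/drop rule)."""
    return [j for j in range(X.shape[1])
            if not fisher_z_ci_test(X[:, j], seed, alpha=alpha)]
```

Passing a conditioning set `z` lets the same test check conditional independence given already-accepted features, which is the flavor of test the abstract describes.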

