4.7 Article

NoLeaks: Differentially Private Causal Discovery Under Functional Causal Model

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIFS.2022.3184263

Keywords

Optimization; Privacy; Sensitivity; Standards; Perturbation methods; Numerical models; Inference algorithms; Differential privacy; causal discovery; data privacy; data mining

Funding

  1. NVIDIA Academic Hardware Grant Program

This paper proposes a new approach to enforcing privacy for causal discovery algorithms based on functional causal models. It introduces NoLeaks, a differentially private causal discovery algorithm, along with a highly efficient numerical optimization algorithm for solving it. Evaluation results show that NoLeaks achieves performance comparable or even superior to state-of-the-art approaches and scales smoothly to large datasets.
Causal inference is widely used in clinical research, economic analysis, and other fields. As with many kinds of statistical data, the findings of causal discovery (i.e., a causal graph) might leak demographic information about participants. For example, a causal link between a genome and a rare disease can reveal the participation of a minority patient in a genome-wide association study. To date, differential privacy has served as the de facto foundation for guaranteeing the privacy of causal discovery algorithms. However, existing approaches to protecting causal discovery from privacy leakage rely heavily on private conditional independence tests, which introduce a considerable amount of noise and are thus prone to inaccuracy. Because of their limited accuracy and scalability, they are insufficient for non-trivial datasets (e.g., those with more than ten variables). In this paper, we advocate a novel focus on enforcing privacy for causal discovery algorithms based on functional causal models. First, we propose NoLeaks, a differentially private causal discovery algorithm that is both more accurate and more efficient than prior work. Second, we design a quasi-Newton numerical optimization algorithm for solving NoLeaks in a highly efficient way. Third, we evaluate NoLeaks using both public benchmarks and synthetic data. We observe that NoLeaks achieves performance comparable to, or even better than, state-of-the-art (non-private) approaches. We also find that NoLeaks scales smoothly to large datasets on which existing approaches would fail. A case study and a downstream application further show encouraging results on the versatility of NoLeaks.
