Article

Towards practical differential privacy in data analysis: Understanding the effect of epsilon on utility in private ERM

Journal

COMPUTERS & SECURITY
Volume 128, Issue -, Pages -

Publisher

ELSEVIER ADVANCED TECHNOLOGY
DOI: 10.1016/j.cose.2023.103147

Keywords

Differential privacy; Machine learning; Parameter selection


As privacy-preserving data analysis becomes increasingly important, the trade-off between the privacy guarantee and analysis accuracy has become a practical concern. This paper focuses on private ERM and explores the effect of the differential privacy parameter ε on the utility of the learning model. The proposed approach provides a practical way to estimate utility under different privacy requirements; its high estimation accuracy and broad applicability make it valuable in practical applications.
As computation over sensitive data has become an important goal in recent years, privacy-preserving data analysis has attracted growing attention. Among the various mechanisms, differential privacy has been widely studied because of its formal privacy guarantees for data analysis. One of the most important issues is the crucial trade-off between the strength of the privacy guarantee and the resulting analysis accuracy, which has been a major concern among researchers. Existing theory for this issue assumes that the analyst first chooses a privacy requirement and then attempts to maximize utility. However, as differential privacy is gradually deployed in practice, a gap between theory and practice has emerged: in practice, product requirements often impose hard accuracy constraints, and privacy (while desirable) may not be the overriding concern. The privacy requirement is therefore usually adjusted according to the utility expectation, not the other way around. This gap raises the question of how to provide the maximum privacy guarantee under a given accuracy requirement. In this paper, we focus on private Empirical Risk Minimization (ERM), one of the most commonly used data analysis methods. We take a first step towards solving the above problem by theoretically exploring the effect of ε (the differential privacy parameter that determines the strength of the privacy guarantee) on the utility of the learning model. We trace how utility changes as ε is modified and reveal a consistent relationship between ε and utility. We then formalize this relationship and propose a practical approach for estimating the utility under an arbitrary value of ε. Both theoretical analysis and experimental results demonstrate the high estimation accuracy and broad applicability of our approach in practical applications. As providing algorithms with strong utility guarantees that also preserve privacy when possible becomes more widely accepted, our approach has high practical value and is likely to be adopted by companies and organizations that want to preserve privacy but are unwilling to compromise on utility. © 2023 Published by Elsevier Ltd.
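To make the ε-utility trade-off described in the abstract concrete, the sketch below (illustrative only, not the paper's estimation approach) trains a differentially private L2-regularized logistic regression via output perturbation in the style of Chaudhuri et al. (2011) and traces test accuracy as ε varies. The function name, the regularization strength lam, and the synthetic data are assumptions for illustration; features are assumed scaled so that ||x|| ≤ 1, which gives the ERM minimizer an L2-sensitivity of 2/(n·λ).

```python
# Minimal sketch of epsilon-DP ERM via output perturbation
# (Chaudhuri et al., 2011), used here only to illustrate how
# epsilon controls the noise scale and hence the model's utility.
import numpy as np
from sklearn.linear_model import LogisticRegression

def private_erm_output_perturbation(X, y, epsilon, lam=0.1, rng=None):
    """Fit L2-regularized logistic regression, then add noise calibrated
    to the L2-sensitivity 2/(n*lam) of the ERM minimizer (valid for a
    1-Lipschitz loss, lam-strongly-convex regularizer, ||x|| <= 1)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Non-private ERM solution; C = 1/(n*lam) matches the objective
    # (1/n) * sum(loss) + (lam/2) * ||w||^2 in sklearn's parameterization.
    clf = LogisticRegression(C=1.0 / (n * lam), fit_intercept=False,
                             max_iter=1000)
    clf.fit(X, y)
    w = clf.coef_.ravel()
    # epsilon-DP noise: uniform direction, norm ~ Gamma(d, sensitivity/eps),
    # i.e. density proportional to exp(-epsilon * ||b|| / sensitivity).
    sensitivity = 2.0 / (n * lam)
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)
    noise_norm = rng.gamma(shape=d, scale=sensitivity / epsilon)
    return w + noise_norm * direction

# Trace utility (training accuracy) as epsilon varies on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 10))
X /= np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1.0)  # ||x|| <= 1
y = (X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(2000) > 0).astype(int)
for eps in [0.1, 0.5, 1.0, 5.0]:
    w_priv = private_erm_output_perturbation(X, y, eps, rng=1)
    acc = ((X @ w_priv > 0).astype(int) == y).mean()
    print(f"epsilon={eps:4.1f}  accuracy={acc:.3f}")
```

Running such a sweep typically shows accuracy approaching the non-private optimum as ε grows and degrading sharply for small ε; characterizing and estimating exactly this relationship, so that ε can be chosen to meet a utility target, is what the paper formalizes.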

