4.8 Article

Correlated Differential Privacy: Feature Selection in Machine Learning

Journal

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS
Volume 16, Issue 3, Pages 2115-2124

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TII.2019.2936825

Keywords

Correlation; Differential privacy; Sensitivity; Feature extraction; Machine learning; Machine learning algorithms; Data correlation; Feature selection

Funding

  1. National Natural Science Foundation of China [61972366]
  2. Australian Research Council [LP170100123]
  3. Ministry of Education, Humanities, and Social Science Project of China [19A 10520035]

Privacy preservation in machine learning is a crucial issue in industrial informatics, since the data used for training in industry usually contain sensitive information. Existing differentially private machine learning algorithms do not consider the impact of data correlation, which may lead to more privacy leakage than expected in industrial applications. For example, data collected for traffic monitoring may contain correlated records due to temporal correlation or user correlation. To fill this gap, this article proposes a correlation reduction scheme with differentially private feature selection that addresses the privacy loss arising when data are correlated in machine learning tasks. The proposed scheme involves five steps, with the goals of managing the extent of data correlation, preserving privacy, and supporting accuracy in the prediction results. In this way, the proposed feature selection scheme relieves the impact of data correlation while still guaranteeing privacy for correlated data during learning. The proposed method can be widely used in machine learning algorithms that provide services in industrial areas. Experiments show that, compared to existing schemes, the proposed scheme produces better prediction results on machine learning tasks and lower mean squared error for data queries.
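
To illustrate why correlated records can leak more than expected, the following is a minimal sketch of a Laplace-mechanism count query. It is not the article's five-step scheme, which the abstract does not detail. It only shows the standard group-privacy argument: if up to `group_size` records can change together, the sensitivity of a count grows by that factor, so the noise scale must grow with it to keep the same epsilon. The names `laplace_scale`, `noisy_count`, and `group_size` are hypothetical and chosen for illustration.

```python
import numpy as np


def laplace_scale(sensitivity, epsilon, group_size=1):
    """Noise scale for the Laplace mechanism.

    group_size is an illustrative assumption: the maximum number of
    records that may change together because they are correlated.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # Group privacy: k correlated records multiply the sensitivity by k.
    return sensitivity * group_size / epsilon


def noisy_count(records, predicate, epsilon, group_size=1, rng=None):
    """Return a count of matching records with Laplace noise added."""
    rng = np.random.default_rng() if rng is None else rng
    true_count = sum(1 for r in records if predicate(r))
    # A count query changes by at most 1 per record, so sensitivity is 1.
    scale = laplace_scale(1.0, epsilon, group_size)
    return float(true_count + rng.laplace(loc=0.0, scale=scale))


if __name__ == "__main__":
    records = [1, 0, 1, 1, 0, 1]
    rng = np.random.default_rng(0)
    # Independent records versus records correlated in groups of up to 3.
    print(noisy_count(records, lambda r: r == 1, epsilon=1.0, rng=rng))
    print(noisy_count(records, lambda r: r == 1, epsilon=1.0, group_size=3, rng=rng))
```

Scaling the noise by the full group size is the most conservative choice and costs accuracy, which is the kind of trade-off a correlation reduction step is meant to ease.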
