Article

Incremental k-Anonymous Microaggregation in Large-Scale Electronic Surveys with Optimized Scheduling

IEEE ACCESS (2018)

Journal

IEEE ACCESS

Volume 6, Issue -, Pages 60016-60044

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2018.2875949

Keywords

Data privacy; statistical disclosure control; k-anonymity; microaggregation; electronic surveys; large-scale datasets

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications

Funding

Spanish Ministry of Industry, Energy and Tourism
Project Data-Distortion Framework (DDF) [TSI-100202-2013-23]
Spanish Ministry of Economy and Competitiveness through the Project Anonymized Demographic Surveys [TIN2014-58259-JIN]
Funding Program Proyectos de I+D+i para Jovenes Investigadores through the Project INRISCO [TEC2014-54335-C4-1-R]
Project Advanced Forensic Analysis [TEC2015-68734-R]
Agencia de Gestio d'Ajuts Universitaris i de Recerca of the Government of Catalonia [SGR 2014-1504, SGR 2017-782]

Abstract

Improvements in technology have led to enormous volumes of detailed personal information made available for any number of statistical studies. This has stimulated the need for anonymization techniques striving to attain a difficult compromise between the usefulness of the data and the protection of our privacy. k-Anonymous microaggregation permits releasing a dataset where each person remains indistinguishable from k - 1 other individuals, through the aggregation of demographic attributes, otherwise a potential culprit for respondent reidentification. Although privacy guarantees are by no means absolute, the elegant simplicity of the k-anonymity criterion and the excellent preservation of information utility of microaggregation algorithms have turned them into widely popular approaches whenever data utility is critical. Unfortunately, high-utility algorithms on large datasets inherently require extensive computation. This paper addresses the need to run k-anonymous microaggregation efficiently with mild distortion loss, exploiting the fact that the data may arrive over an extended period of time. Specifically, we propose to split the original dataset into two portions that will be processed subsequently, allowing the first process to start before the entire dataset is received while leveraging the superlinearity of the involved microaggregation algorithms. A detailed mathematical formulation enables us to calculate the optimal time for the fastest anonymization as well as for minimum distortion under a given deadline. Two incremental microaggregation algorithms are devised, for which extensive experimentation is reported. The presented theoretical methodology should prove invaluable in numerous data-collection applications, including large-scale electronic surveys in which computation is possible as the data come in.
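The scheduling idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's formulation: it assumes records arrive at a uniform rate over `arrival_time`, that anonymizing m records on a single processor costs `cost_coeff * m ** exponent` (quadratic by default, as a stand-in for a superlinear algorithm), and that the two portions are anonymized independently. The names `completion_time` and `best_split` are hypothetical. Under those assumptions, a simple grid search finds the fraction of the data to process early so that the whole job finishes soonest.

```python
def completion_time(alpha, n, arrival_time, cost_coeff, exponent=2.0):
    """Finish time if the first alpha*n records are anonymized early.

    Toy model (assumption): uniform arrivals, one processor, and a
    running time of cost_coeff * m**exponent for m records.
    """
    first_size = alpha * n
    second_size = (1.0 - alpha) * n
    # The first portion can start only once its records have arrived.
    first_start = alpha * arrival_time
    first_end = first_start + cost_coeff * first_size ** exponent
    # The second portion needs all the data and a free processor.
    second_start = max(arrival_time, first_end)
    return second_start + cost_coeff * second_size ** exponent


def best_split(n, arrival_time, cost_coeff, exponent=2.0, steps=1000):
    """Grid search over the split fraction alpha in [0, 1]."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda a: completion_time(a, n, arrival_time, cost_coeff, exponent),
    )


if __name__ == "__main__":
    n, arrival_time, cost_coeff = 1000, 100.0, 1e-4
    alpha = best_split(n, arrival_time, cost_coeff)
    print("no split:", completion_time(0.0, n, arrival_time, cost_coeff))
    print("best alpha:", alpha)
    print("finish:", completion_time(alpha, n, arrival_time, cost_coeff))
```

Because the cost is superlinear, two smaller runs are cheaper than one large run, and the first run overlaps with data collection. The price, which this sketch does not model, is the extra distortion from anonymizing the portions separately.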
