4.5 Article

An algorithm for the microaggregation problem using column generation

Journal

COMPUTERS & OPERATIONS RESEARCH
Volume 144, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.cor.2022.105817

Keywords

Integer programming; Column generation; Data privacy; Clustering; Microaggregation

Funding

  1. ERASMUS+ Traineeship Program
  2. Italian Ministry for Education, University, and Research [PRIN 2015B5F27W]
  3. European Union [ITN 764759]
  4. CNR Short-Term Mobility program [CNR-STM14-PROT26328]
  5. Spanish Ministry of Science, Innovation and Universities [MCIU/AEI/FEDER RTI2018097580-B-I00]

Ask authors/readers for more resources

The field of statistical disclosure control is focused on reducing the risk of re-identifying individuals from disseminated data. Previous research has mainly focused on protecting tabular data, while less attention has been given to protecting microdata containing individual records and attributes. This paper proposes a new heuristic approach based on column generation to address the microaggregation problem, which is known to be NP-hard and one of the best methods for microdata protection. The computational results show promising performance compared to existing heuristics in the literature.
The field of statistical disclosure control aims to reduce the risk of re-identifying an individual from disseminated data, a major concern among national statistical agencies. Operations Research (OR) techniques have been widely used in the past for protecting tabular data, but not microdata (i.e., files of individuals and attributes). Few papers apply OR techniques to the microaggregation problem, which is considered one of the best methods for microdata protection and is known to be NP-hard. The new heuristic approach is based on a column generation scheme and, unlike previous (primal) heuristics for microaggregation, it also provides a lower bound on the optimal microaggregation. Using real data that is typically used in the literature, our computational results show, first, that solutions with small gaps are often achieved and, second, that dramatic improvements are obtained relative to the literature's most popular heuristics.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.5
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available