Article

Using genetic algorithms for attribute grouping in multivariate microaggregation

Journal

INTELLIGENT DATA ANALYSIS
Volume 18, Issue 5, Pages 819-836

Publisher

IOS PRESS
DOI: 10.3233/IDA-140670

Keywords

Genetic clustering algorithms; multivariate microaggregation; attribute selection

Funding

  1. Ministry of Science and Technology of Spain [TIN2012-34557]
  2. BSC-CNS Severo Ochoa program [SEV-2011-00067]

Anonymization techniques that provide k-anonymity lose quality when data dimensionality is high, and microaggregation techniques are no exception. Given a set of records, attributes are grouped into non-intersecting subsets, and each subset is microaggregated independently. While this improves quality by reducing information loss, it usually sacrifices the k-anonymity property and thereby increases entity disclosure risk. In spite of this, grouping attributes is still common practice for data sets containing a large number of records. The amount of information loss and disclosure risk varies with the attributes chosen and their correlation. However, there have been no serious attempts to propose a method for finding the best attribute grouping. In this paper, we present GOMM, the Genetic Optimizer for Multivariate Microaggregation, which, as far as we know, is the first proposal to use evolutionary algorithms for this problem. The goal of GOMM is to find the optimal, or near-optimal, attribute grouping, taking into account both information loss and disclosure risk. We propose a way to map attribute subsets into a chromosome and a set of new mutation operations for this context. We also provide a comprehensive analysis of the proposed operations and show that, when applied to different real data sets, our evolutionary approach yields better quality in the anonymized data than previously used ad hoc attribute grouping techniques. Additionally, we provide an improved version of GOMM, called D-GOMM, in which operations are executed dynamically during the optimization process to reduce GOMM's execution time.
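
The abstract does not spell out how GOMM encodes groupings or which mutation operations it defines, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a chromosome that stores one group label per attribute, a hypothetical "move" mutation, a toy fixed-size microaggregation that sorts records by the sum of the grouped attributes, and a sum-of-squared-errors measure standing in for information loss. Disclosure risk, which GOMM also takes into account, is omitted here. All function names are illustrative.

```python
import random
from collections import defaultdict


def decode(chromosome):
    """Turn a label-per-attribute chromosome into a list of attribute groups."""
    groups = defaultdict(list)
    for attr_index, label in enumerate(chromosome):
        groups[label].append(attr_index)
    return [groups[label] for label in sorted(groups)]


def move_mutation(chromosome, n_groups, rng):
    """Hypothetical mutation: reassign one random attribute to another label.

    Requires n_groups >= 2. A group may become empty; decode() then drops it.
    """
    mutated = list(chromosome)
    pos = rng.randrange(len(mutated))
    choices = [g for g in range(n_groups) if g != mutated[pos]]
    mutated[pos] = rng.choice(choices)
    return mutated


def microaggregate(records, attrs, k):
    """Toy microaggregation restricted to the attributes in `attrs`.

    Records are sorted by the sum of those attributes, cut into clusters of
    at least k records, and the values are replaced by the cluster means.
    """
    order = sorted(range(len(records)),
                   key=lambda i: sum(records[i][a] for a in attrs))
    out = [list(r) for r in records]
    n, start = len(order), 0
    while start < n:
        end = start + k
        if n - end < k:  # fold leftover records into the last cluster
            end = n
        cluster = order[start:end]
        for a in attrs:
            mean = sum(records[i][a] for i in cluster) / len(cluster)
            for i in cluster:
                out[i][a] = mean
        start = end
    return out


def anonymize(records, chromosome, k):
    """Microaggregate each attribute group of the chromosome independently."""
    protected = records
    for group in decode(chromosome):
        protected = microaggregate(protected, group, k)
    return protected


def information_loss(original, protected):
    """Sum of squared errors between original and protected values."""
    return sum((o - p) ** 2
               for orow, prow in zip(original, protected)
               for o, p in zip(orow, prow))
```

A genetic search in this spirit would repeatedly mutate chromosomes, call `anonymize` on each candidate, and keep those with the best score. In a fuller version the score would combine `information_loss` with a disclosure risk estimate rather than use information loss alone.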
