4.6 Article

Privacy Preserving Attribute-Focused Anonymization Scheme for Healthcare Data Publishing

Journal

IEEE ACCESS
Volume 10, Issue -, Pages 86979-86997

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3199433

Keywords

Anonymization; data privacy; data publishing; healthcare data; privacy-preserving

Ask authors/readers for more resources

Advancements in Industry 4.0 have brought significant improvements to the healthcare sector. However, sharing healthcare data while preserving privacy is challenging due to security concerns. This paper presents an attribute-focused privacy preserving data publishing scheme that combines fixed-interval and improved l-diverse slicing approaches. Experimental results show improved accuracy and reduced information loss compared to existing methods. The proposed scheme provides data utility while protecting against various privacy breaches.
Advancements in Industry 4.0 brought tremendous improvements in the healthcare sector, such as better quality of treatment, enhanced communication, remote monitoring, and reduced cost. Sharing healthcare data with healthcare providers is crucial for harnessing the benefits of such improvements. In general, healthcare data holds sensitive information about individuals. Hence, sharing such data is challenging because of various security and privacy issues. According to privacy regulations and ethical requirements, it is essential to preserve the privacy of patients before sharing data for medical research. State-of-the-art literature on privacy preserving studies either uses cryptographic approaches to protect the privacy or uses anonymizing techniques regardless of the type of attributes, this results in poor protection and data utility. In this paper, we propose an attribute-focused privacy preserving data publishing scheme. The proposed scheme is two-fold, comprising a fixed-interval approach to protect numerical attributes and an improved l-diverse slicing approach to protect the categorical and sensitive attributes. In the fixed-interval approach, the original values of the healthcare data are replaced with an equivalent computed value. The improved l-diverse slicing approach partitions the data both horizontally and vertically to avoid privacy leaks. Extensive experiments with real-world datasets are conducted to evaluate the performance of the proposed scheme. The classification models built on anonymized dataset yields approximately 13% better accuracy than benchmarked algorithms. Experimental analyses show that the average information loss which is measured by normalized certainty penalty (NCP) is reduced by 12% compared to similar approaches. The attribute focused scheme not only provides data utility but also prevents the data from membership disclosures, attribute disclosures, and identity disclosures.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available