Article

Privacy-Preserving Image Captioning with Deep Learning and Double Random Phase Encoding

Journal

Mathematics
Volume 10, Issue 16, Article 2859

Publisher

MDPI
DOI: 10.3390/math10162859

Keywords

image captioning; deep learning; privacy preserving; double random phase encoding; deep neural networks

Funding

  1. National Research Foundation of Korea (NRF) - Korean government (MSIT) [2020R1A2C3006234]
  2. Institute of Information & Communications Technology Planning & Evaluation (IITP) - Korean government (MSIT) [2020-0-00126]
  3. National Research Foundation of Korea [2020R1A2C3006234] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)


With the increasing amount of data being produced daily, cloud storage has become important. To protect privacy, this paper proposes a method that generates captions in the cloud for images encrypted with a double random phase encoding (DRPE) scheme, and evaluates the predicted captions using the BLEU, METEOR, ROUGE, and CIDEr metrics.

Abstract

Cloud storage has become prominent as the amount of data produced daily keeps growing, and this has raised substantial concerns about privacy and unauthorized access. To preserve privacy, users can upload their private data to the cloud in encrypted form. Encryption allows computations to be performed on the data without decrypting it in the cloud, which would otherwise demand enormous computational resources and expose private data to unauthorized access. Data analyses such as classification and image query and retrieval can therefore preserve privacy when they are carried out on encrypted data. This paper proposes an image-captioning method that generates captions over encrypted images using an encoder-decoder framework with attention and a double random phase encoding (DRPE) encryption scheme. The images are encrypted with DRPE to protect them and then fed to an encoder based on the ResNet architecture, which produces a fixed-length vector of feature representations. The decoder is built with long short-term memory (LSTM) and processes these features together with word embeddings to generate descriptive captions for the images. We evaluate the predicted captions with the BLEU, METEOR, ROUGE, and CIDEr metrics. The experimental results demonstrate the feasibility of our privacy-preserving image captioning on the popular benchmark Flickr8k dataset.
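The pipeline described in the abstract lends itself to a compact sketch. The Python code below illustrates, under assumptions of ours rather than the authors' exact implementation, how the pieces could fit together: DRPE encryption of an image with two random phase masks, a ResNet-50 encoder that yields a spatial grid of features, and an LSTM decoder with a simple attention mechanism over those features. All names, layer sizes, and the choice to keep only the magnitude of the complex DRPE output are illustrative.

import numpy as np
import torch
import torch.nn as nn
import torchvision.models as models


def drpe_encrypt(img: np.ndarray) -> np.ndarray:
    """Double random phase encoding of a 2-D image with values in [0, 1].

    The image is multiplied by a random phase mask in the spatial domain,
    Fourier transformed, multiplied by a second random phase mask in the
    frequency domain, and inverse transformed. The result is complex valued;
    this sketch keeps its magnitude as the encrypted image fed to the encoder
    (one common choice; the paper may treat the complex output differently).
    For RGB images the function can be applied per channel.
    """
    h, w = img.shape
    mask1 = np.exp(2j * np.pi * np.random.rand(h, w))   # spatial-domain mask
    mask2 = np.exp(2j * np.pi * np.random.rand(h, w))   # frequency-domain mask
    encrypted = np.fft.ifft2(np.fft.fft2(img * mask1) * mask2)
    return np.abs(encrypted)


class CaptionModel(nn.Module):
    """Encoder-decoder captioner: ResNet features -> attention -> LSTM -> words."""

    def __init__(self, vocab_size: int, embed_dim: int = 256, hidden_dim: int = 512):
        super().__init__()
        resnet = models.resnet50(weights=None)                        # encoder backbone
        self.encoder = nn.Sequential(*list(resnet.children())[:-2])   # keep spatial feature map
        self.feat_dim = 2048
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.attn = nn.Linear(self.feat_dim + hidden_dim, 1)          # attention score per location
        self.lstm = nn.LSTMCell(embed_dim + self.feat_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, vocab_size)
        self.init_h = nn.Linear(self.feat_dim, hidden_dim)
        self.init_c = nn.Linear(self.feat_dim, hidden_dim)

    def forward(self, images: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # images: (B, 3, H, W) DRPE-encrypted inputs; captions: (B, T) token ids
        feats = self.encoder(images)                 # (B, 2048, h, w)
        feats = feats.flatten(2).transpose(1, 2)     # (B, h*w, 2048)
        h = self.init_h(feats.mean(1))               # initialize LSTM state from mean feature
        c = self.init_c(feats.mean(1))
        outputs = []
        for t in range(captions.size(1) - 1):        # teacher forcing over the caption
            h_tiled = h.unsqueeze(1).expand(-1, feats.size(1), -1)
            scores = self.attn(torch.cat([feats, h_tiled], dim=2))    # (B, h*w, 1)
            alpha = torch.softmax(scores, dim=1)                      # attention weights
            context = (alpha * feats).sum(dim=1)                      # (B, 2048)
            word = self.embed(captions[:, t])
            h, c = self.lstm(torch.cat([word, context], dim=1), (h, c))
            outputs.append(self.fc(h))               # logits over the vocabulary
        return torch.stack(outputs, dim=1)           # (B, T-1, vocab_size)

Training such a model typically minimizes cross-entropy between the predicted word distributions and the reference captions, and the generated sentences are then scored against the human references with BLEU, METEOR, ROUGE, and CIDEr, as the abstract states.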

