Article

Detecting adversarial examples via prediction difference for deep neural networks

Journal

INFORMATION SCIENCES
Volume 501, Issue -, Pages 182-192

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2019.05.084

Keywords

Deep neural network; Adversarial example; Image recognition; Prediction difference

Funding

  1. National Natural Science Foundation of China [61876019]
  2. China Scholarship Council [201706035021]
  3. German Research Foundation in project Crossmodal Learning [TRR-169]

Abstract

Deep neural networks (DNNs) perform effectively in many computer vision tasks. However, DNNs have been found to be vulnerable to adversarial examples, which are generated by adding imperceptible perturbations to original images. To address this problem, we propose a novel defense method, transferability prediction difference (TPD), which drastically improves the adversarial robustness of DNNs while sacrificing little verified accuracy. We find that adversarial examples exhibit a larger prediction difference across different DNN models, owing to the models' differing and complicated decision boundaries; this difference can be used to identify adversarial examples by comparing it against a prediction difference threshold. We adopt the K-means clustering algorithm on benign data to determine the transferability prediction difference threshold, with which we can detect adversarial examples accurately and efficiently. Furthermore, the TPD method neither modifies the target model nor requires knowledge of the adversarial attacks. We evaluate TPD models trained on MNIST and CIFAR-10 against four state-of-the-art adversarial attacks (FGSM, BIM, JSMA and C&W), obtaining average detection accuracies of 96.74% and 86.61%, respectively. The results show that the TPD model achieves a high detection ratio on advanced white-box adversarial examples while keeping a low false positive rate on benign examples. (C) 2019 Elsevier Inc. All rights reserved.
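The abstract outlines the core mechanism: run the same input through several independently trained DNNs, measure how much their predictions disagree, and flag inputs whose disagreement exceeds a threshold fitted on benign data with K-means. Below is a minimal Python sketch of that idea; the L1 distance metric, the Keras-style model.predict interface, the two-model setup, and the largest-cluster-centre threshold rule are illustrative assumptions, not the paper's exact procedure.

    import numpy as np
    from sklearn.cluster import KMeans

    def prediction_difference(models, x):
        # L1 distance between the softmax outputs of two models for one input
        # (an assumed metric; the paper's exact measure may differ).
        probs = [m.predict(x[np.newaxis, ...])[0] for m in models]
        return float(np.abs(probs[0] - probs[1]).sum())

    def fit_threshold(models, benign_images, n_clusters=2):
        # Cluster prediction differences of benign inputs with K-means and
        # take the largest cluster centre as the threshold (an assumed rule).
        diffs = np.array([prediction_difference(models, x) for x in benign_images])
        km = KMeans(n_clusters=n_clusters, n_init=10).fit(diffs.reshape(-1, 1))
        return float(km.cluster_centers_.max())

    def is_adversarial(models, x, threshold):
        # Inputs on which the models disagree more than benign data typically
        # allows are flagged as adversarial.
        return prediction_difference(models, x) > threshold

For example, with two independently trained MNIST classifiers model_a and model_b (hypothetical names), threshold = fit_threshold((model_a, model_b), benign_set) fixes the detector once, and is_adversarial((model_a, model_b), candidate, threshold) screens new inputs. Since the detector only reads model outputs, the target model itself is never modified, consistent with the abstract's claim.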

