Article

Defeating Misclassification Attacks Against Transfer Learning

Journal

IEEE Transactions on Dependable and Secure Computing

Publisher

IEEE Computer Society
DOI: 10.1109/TDSC.2022.3144988

Keywords

Transfer learning; Task analysis; Mathematical models; Training; Computational modeling; Data models; Perturbation methods; Deep neural network; defence against adversarial examples; transfer learning; pre-trained model

Abstract
Transfer learning is a prevalent technique for efficiently generating new models (Student models) from the knowledge transferred by a pre-trained model (Teacher model). However, Teacher models are often publicly shared for reuse, which inevitably introduces vulnerabilities that enable severe attacks against transfer learning systems. In this article, we take a first step towards mitigating one of the most advanced misclassification attacks in transfer learning. We design a distilled differentiator via activation-based network pruning, which weakens the attack's transferability while retaining accuracy. We adopt an ensemble of variant differentiators to improve the robustness of the defence. To avoid a bloated ensemble at inference time, we propose a two-phase defence: inference on the Student model is first performed to narrow down the candidate differentiators, and then only a small, fixed number of them is consulted to validate clean inputs or reject adversarial ones. Comprehensive evaluations on both large and small image recognition tasks confirm that Student models equipped with our defence of only five differentiators are immune to over 90% of adversarial inputs, with an accuracy loss of less than 10%. Our comparison also demonstrates that our design outperforms prior defences.
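The abstract's "activation-based network pruning" can be illustrated with a minimal PyTorch sketch. The ranking criterion here (mean absolute channel activation measured on clean data) and all names (`prune_low_activation_channels`, `clean_loader`, the pruning ratio) are illustrative assumptions, not the authors' exact procedure; it only shows the general idea of removing the least-active filters to build one pruned differentiator.

```python
import torch
import torch.nn as nn


def prune_low_activation_channels(model: nn.Module, conv_layer: nn.Conv2d,
                                  clean_loader, ratio=0.3, device="cpu"):
    """Zero out the conv channels with the smallest mean |activation| on clean data."""
    per_batch_scores = []

    def hook(_module, _inputs, output):
        # output has shape [N, C, H, W]; record mean |activation| per channel.
        per_batch_scores.append(output.detach().abs().mean(dim=(0, 2, 3)))

    handle = conv_layer.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        for images, _labels in clean_loader:
            model(images.to(device))
    handle.remove()

    scores = torch.stack(per_batch_scores).mean(dim=0)  # per-channel importance
    n_pruned = int(ratio * scores.numel())
    prune_idx = scores.argsort()[:n_pruned]             # least-active channels

    with torch.no_grad():
        conv_layer.weight[prune_idx] = 0.0              # mask the pruned filters
        if conv_layer.bias is not None:
            conv_layer.bias[prune_idx] = 0.0
    return prune_idx
```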
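The two-phase inference can be sketched similarly. The per-label mapping from the Student's prediction to candidate differentiators, the unanimous-vote acceptance rule, and the single-input (batch size 1) assumption are all choices made for illustration; the paper fixes only that a small number of differentiators (e.g., five) is consulted.

```python
import torch


def two_phase_defence(student, differentiators_by_label, x, n_diff=5):
    """Phase 1: the Student predicts a label, narrowing the candidate
    differentiators. Phase 2: a small, fixed number of them vote; the
    input is accepted as clean only if all agree with the Student."""
    with torch.no_grad():
        label = student(x).argmax(dim=1).item()   # x: one input, shape [1, C, H, W]
        candidates = differentiators_by_label[label][:n_diff]
        votes = [d(x).argmax(dim=1).item() == label for d in candidates]
    return label, all(votes)                      # (prediction, accepted?)
```

In this sketch, an input that fails the vote would be flagged as (likely) adversarial rather than returned to the application.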
