Article

Learning transferable targeted universal adversarial perturbations by sequential meta-learning

Journal

COMPUTERS & SECURITY
Volume 137, Issue -, Pages -

Publisher

ELSEVIER ADVANCED TECHNOLOGY
DOI: 10.1016/j.cose.2023.103584

Keywords

Targeted adversarial attacks; Model-agnostic meta-learning; Data-free universal adversarial perturbations; Transfer-based black-box attacks

In this study, we aim to learn targeted universal adversarial perturbations (UAPs) with higher transferability by ensembling multiple models. We propose a normalized logit loss that narrows the margin between the target class's logits across models, and we introduce a sequential meta-learning optimization strategy to further increase transferability. Experimental results show that our approach outperforms existing ensemble attacks in both white-box and black-box settings.
The transferability of adversarial perturbations in non-targeted scenarios has recently been studied extensively. However, changing the prediction of an unknown model to a pre-defined target class remains challenging. In this study, we aim to learn targeted universal adversarial perturbations (UAPs) with higher transferability by ensembling multiple models. First, we observe that in existing ensemble-based attacks the logit of the target class is biased toward a specific white-box model. To address this issue, we propose a normalized logit loss that narrows the margin between the target class's logits across models. In addition, we introduce a sequential meta-learning optimization strategy, consisting of an inner loop and an outer loop, to further increase transferability. In the inner loop, we sequentially learn a task-specific targeted UAP for each source model, jointly considering the perturbation from the previous model. In the outer loop, we optimize the task-agnostic targeted UAP by combining the targeted UAPs from the inner loop. Experimental results show that the normalized logit loss and the sequential meta-learning optimization strategy benefit each other when learning targeted adversarial perturbations, and that the method outperforms existing ensemble attacks in both white-box and black-box settings. The source code of this study is available at: Link.
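
The abstract does not give the formulas, so the following is only a rough sketch of how the two ideas could fit together, not the authors' implementation. It assumes toy linear "source models" (random weight matrices), random stand-in inputs, L2 normalization of the logit vector as the form of the normalized logit loss, sign-gradient steps, and averaging as the way the outer loop combines the inner-loop perturbations. All function names and hyperparameters are hypothetical.

```python
import numpy as np


def normalized_target_logit(W, x, target):
    """Return z_t / ||z||_2 for logits z = W @ x, and its gradient w.r.t. x.

    Assumption: L2-normalizing the logits is one way to put the target logit
    of different models on a comparable scale.
    """
    z = W @ x
    norm = max(np.linalg.norm(z), 1e-12)
    value = z[target] / norm
    # d(z_t / ||z||) / dz = e_t / ||z|| - z_t * z / ||z||^3
    dz = -z[target] * z / norm**3
    dz[target] += 1.0 / norm
    return value, W.T @ dz


def sequential_meta_uap(models, proxies, target, eps=0.5,
                        inner_lr=0.1, outer_lr=0.5, outer_steps=50):
    """Toy sequential meta-learning loop for one targeted universal perturbation."""
    delta = np.zeros(proxies.shape[1])  # task-agnostic perturbation
    for _ in range(outer_steps):
        task_delta = delta.copy()
        task_deltas = []
        # Inner loop: visit the source models in order; each step starts from
        # the perturbation left by the previous model.
        for W in models:
            grad = np.mean(
                [normalized_target_logit(W, x + task_delta, target)[1]
                 for x in proxies],
                axis=0,
            )
            # Sign-gradient ascent on the normalized target logit, kept in the eps-ball.
            task_delta = np.clip(task_delta + inner_lr * np.sign(grad), -eps, eps)
            task_deltas.append(task_delta)
        # Outer loop: move the shared perturbation toward the mean of the
        # task-specific perturbations (assumed combination rule).
        combined = np.mean(task_deltas, axis=0)
        delta = np.clip(delta + outer_lr * (combined - delta), -eps, eps)
    return delta


rng = np.random.default_rng(0)
models = [rng.normal(size=(4, 8)) for _ in range(3)]  # three toy linear models
proxies = 0.1 * rng.normal(size=(16, 8))              # stand-in inputs
delta = sequential_meta_uap(models, proxies, target=2)
for i, W in enumerate(models):
    score = np.mean([normalized_target_logit(W, x + delta, 2)[0] for x in proxies])
    print(f"model {i}: mean normalized target logit = {score:.3f}")
```

Printing the per-model scores makes it easy to see whether the target logit ends up similar across the source models, which is the imbalance the normalized loss is meant to reduce. A real attack would replace the linear maps with deep networks and use automatic differentiation instead of the hand-written gradient.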