4.5 Article

Adversarial example detection based on saliency map features

Journal

APPLIED INTELLIGENCE
Volume 52, Issue 6, Pages 6262-6275

Publisher

SPRINGER
DOI: 10.1007/s10489-021-02759-8

Keywords

Machine learning; Adversarial example detection; Interpretability; Saliency map

Funding

  1. National Defense Basic Scientific Research Program of China [JCKY2018603B006]

In recent years, machine learning has significantly improved image recognition, but neural network models have also proved vulnerable to adversarial examples. Using interpretability methods to reveal the models' internal decision-making behavior, the authors propose a method for detecting adversarial examples based on multilayer saliency features. Experimental results show that the method detects adversarial examples across various attack scenarios, with performance comparable to state-of-the-art methods.
In recent years, machine learning has greatly improved image recognition capability. However, studies have shown that neural network models are vulnerable to adversarial examples, which cause models to output wrong answers with high confidence. To understand these vulnerabilities, we use interpretability methods to reveal the internal decision-making behavior of models. The interpretation results show that the nonnormalized saliency maps of clean and adversarial examples become increasingly differentiated as they evolve along the model's hidden layers. Taking advantage of this phenomenon, we propose an adversarial example detection method based on multilayer saliency features, which comprehensively captures the abnormal characteristics of adversarial example interpretations. Experimental results show that the proposed method can effectively detect adversarial examples generated by gradient-based, optimization-based, and black-box attacks, and that it is comparable with state-of-the-art methods.
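
The abstract does not say which saliency method, layers, or detector the authors use, so the following is only a rough sketch of what "multilayer saliency features" could look like, under stated assumptions. It assumes a tiny bias-free ReLU network, uses gradient-times-activation as a stand-in for a nonnormalized saliency map, and summarizes each layer's map with three statistics (mean, standard deviation, maximum). All function names here are hypothetical and not taken from the paper.

```python
import math
import random


def forward(weights, x):
    """Return activations [input, hidden_1, ..., logits] of a bias-free ReLU MLP."""
    acts = [x]
    for i, W in enumerate(weights):
        z = [sum(w * a for w, a in zip(row, acts[-1])) for row in W]
        is_last = i == len(weights) - 1
        acts.append(z if is_last else [max(0.0, v) for v in z])
    return acts


def layer_saliency(weights, acts, target):
    """Unnormalized saliency per layer: d(logit[target])/d(activation) * activation.

    Assumption: gradient-times-activation stands in for the paper's saliency method.
    """
    n = len(weights)
    grad = [1.0 if j == target else 0.0 for j in range(len(acts[-1]))]
    maps = []
    for i in range(n - 1, -1, -1):
        W = weights[i]
        if i != n - 1:
            # Backpropagate through the ReLU that produced acts[i + 1].
            grad = [g if a > 0 else 0.0 for g, a in zip(grad, acts[i + 1])]
        # Backpropagate through the linear map acts[i] -> W @ acts[i].
        grad = [sum(W[j][k] * grad[j] for j in range(len(W)))
                for k in range(len(W[0]))]
        maps.append([g * a for g, a in zip(grad, acts[i])])
    maps.reverse()
    return maps  # maps[i] has the same shape as acts[i]


def saliency_features(weights, x):
    """Concatenate (mean, std, max) of the saliency map at every layer."""
    acts = forward(weights, x)
    logits = acts[-1]
    target = max(range(len(logits)), key=logits.__getitem__)
    feats = []
    for s in layer_saliency(weights, acts, target):
        mean = sum(s) / len(s)
        var = sum((v - mean) ** 2 for v in s) / len(s)
        feats += [mean, math.sqrt(var), max(s)]
    return feats


# Hypothetical toy model: 4 inputs -> 5 hidden units -> 3 classes.
rng = random.Random(0)
toy_weights = [
    [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)],
    [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(3)],
]
toy_input = [rng.uniform(0, 1) for _ in range(4)]
toy_features = saliency_features(toy_weights, toy_input)
```

In a detector of the kind the abstract describes, feature vectors like `toy_features` would be computed for both clean and adversarial inputs and passed to a binary classifier; that training step is omitted here because the abstract gives no details about it.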
