Article

Adversarial example detection based on saliency map features

APPLIED INTELLIGENCE (2022)

Journal

APPLIED INTELLIGENCE

Volume 52, Issue 6, Pages 6262-6275

Publisher

SPRINGER

DOI: 10.1007/s10489-021-02759-8

Keywords

Machine learning; Adversarial example detection; Interpretability; Saliency map

Category

Computer Science, Artificial Intelligence

Funding

National Defense Basic Scientific Research Program of China [JCKY2018603B006]

AI Summary

In recent years, machine learning has significantly enhanced image recognition capabilities, but it has also exposed the vulnerability of neural network models to adversarial examples. Using interpretability methods to reveal the internal decision-making behavior of models, the authors propose a method for detecting adversarial examples based on multilayer saliency features. Experimental results show that the method detects adversarial examples across a range of attack scenarios, with performance comparable to state-of-the-art methods.

Abstract

In recent years, machine learning has greatly improved image recognition capability. However, studies have shown that neural network models are vulnerable to adversarial examples, which make models output wrong answers with high confidence. To understand the vulnerabilities of models, we use interpretability methods to reveal the internal decision-making behaviors of models. Interpretation results show that the nonnormalized saliency maps of clean and adversarial examples evolve differently, and that the difference grows along the model's hidden layers. By taking advantage of this phenomenon, we propose an adversarial example detection method based on multilayer saliency features, which can comprehensively capture the abnormal characteristics of adversarial example interpretations. Experimental results show that the proposed method can effectively detect adversarial examples generated by gradient-based, optimization-based and black-box attacks, and that its performance is comparable to that of state-of-the-art methods.
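The idea of multilayer saliency features can be illustrated with a small sketch. The abstract does not say which interpretability method, which summary statistics, or which detector the authors use, so everything below is an assumption made for illustration: a toy two-layer ReLU network, plain gradient saliency (the gradient of the top logit with respect to each layer), and mean, standard deviation and maximum of the absolute saliency as per-layer features. The function names (`layer_saliency`, `saliency_features`) are hypothetical and not taken from the paper.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(x, W1, W2):
    """Toy two-layer ReLU network: returns pre-activations, activations, logits."""
    z1 = matvec(W1, x)
    h1 = [max(0.0, z) for z in z1]
    logits = matvec(W2, h1)
    return z1, h1, logits


def layer_saliency(x, W1, W2):
    """Nonnormalized gradient saliency of the top logit at each layer.

    Assumption: plain gradients stand in for whatever interpretability
    method the paper actually uses.
    """
    z1, _, logits = forward(x, W1, W2)
    c = max(range(len(logits)), key=lambda k: logits[k])  # predicted class
    grad_h1 = list(W2[c])  # d logit_c / d h1
    # Backpropagate through ReLU: the gradient passes only where z1 > 0.
    grad_z1 = [g if z > 0 else 0.0 for g, z in zip(grad_h1, z1)]
    # d logit_c / d x = W1^T grad_z1
    grad_x = [
        sum(W1[j][i] * grad_z1[j] for j in range(len(W1)))
        for i in range(len(x))
    ]
    return [grad_x, grad_h1]  # ordered from input layer to hidden layer


def summarize(saliency):
    """Hypothetical per-layer statistics: mean, std and max of |saliency|."""
    mags = [abs(v) for v in saliency]
    mean = sum(mags) / len(mags)
    var = sum((m - mean) ** 2 for m in mags) / len(mags)
    return [mean, math.sqrt(var), max(mags)]


def saliency_features(x, W1, W2):
    """Concatenate per-layer statistics into one multilayer feature vector."""
    feats = []
    for s in layer_saliency(x, W1, W2):
        feats.extend(summarize(s))
    return feats


if __name__ == "__main__":
    W1 = [[1.0, -1.0], [0.5, 2.0]]
    W2 = [[1.0, 0.0], [0.0, 1.0]]
    print(saliency_features([1.0, 2.0], W1, W2))
```

In a setup like this, feature vectors computed for known clean and known adversarial inputs would be used to train a binary classifier that serves as the detector. Because the saliency is left unnormalized, differences in magnitude between layers are preserved, and the abstract identifies that layer-wise divergence as the signal the method exploits.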
