Tampere University of Technology

TUTCRIS Research Portal

CIIDefence: Defeating Adversarial Attacks by Fusing Class-specific Image Inpainting and Image Denoising.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Details

Original language: English
Title of host publication: 2019 International Conference on Computer Vision, ICCV 2019
Publisher: IEEE
Pages: 6708-6717
ISBN (Electronic): 9781728148038
Publication status: Published - 2019
Publication type: A4 Article in a conference publication
Event: IEEE/CVF International Conference on Computer Vision
Duration: 27 Oct 2019 - 2 Nov 2019

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
ISSN (Print): 1550-5499

Conference

Conference: IEEE/CVF International Conference on Computer Vision
Period: 27/10/19 - 2/11/19

Abstract

This paper presents a novel approach for protecting deep neural networks from adversarial attacks, i.e., methods that add well-crafted imperceptible modifications to the original inputs so that they are incorrectly classified with high confidence. The proposed defence mechanism is inspired by recent works mitigating adversarial disturbances by means of image reconstruction and denoising. However, unlike previous works, we apply the reconstruction only to small, carefully selected image areas that are most influential to the current classification outcome. The selection process is guided by the class activation map responses obtained for multiple top-ranking class labels. The same regions are also the most prominent for the adversarial perturbations and hence the most important to purify. The resulting inpainting task is substantially more tractable than full image reconstruction, while still being able to prevent adversarial attacks. Furthermore, we combine the selective image inpainting with wavelet-based image denoising to produce a non-differentiable layer that prevents the attacker from using gradient backpropagation. Moreover, the proposed nonlinearity cannot be easily approximated with a simple differentiable alternative, as demonstrated in experiments with the Backward Pass Differentiable Approximation (BPDA) attack. Finally, we experimentally show that the proposed Class-specific Image Inpainting Defence (CIIDefence) withstands several powerful adversarial attacks, including BPDA. The obtained results are consistently better than those of other recent defence approaches.
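The defence pipeline in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the patch-based region selection, the mean-value "inpainting" stand-in, and the one-level Haar soft-threshold denoiser are all simplified assumptions (the paper uses a learned inpainting model and a proper wavelet denoiser); the function names are hypothetical. It only shows the structure: select the CAM-dominant regions, reconstruct them, then denoise, with every step non-differentiable with respect to the input.

```python
import numpy as np

def select_influential_regions(cam_maps, patch=8, top_k=2):
    """For each of the top-ranking class activation maps, pick the patch
    with the largest summed CAM response (simplified selection rule)."""
    h, w = cam_maps[0].shape
    regions = []
    for cam in cam_maps[:top_k]:
        best, best_rc = -np.inf, (0, 0)
        for r in range(0, h - patch + 1, patch):
            for c in range(0, w - patch + 1, patch):
                s = cam[r:r + patch, c:c + patch].sum()
                if s > best:
                    best, best_rc = s, (r, c)
        regions.append((*best_rc, patch))
    return regions

def inpaint_mean(img, regions):
    """Toy inpainting: overwrite each selected patch with the image mean
    (stand-in for the learned inpainting model described in the paper)."""
    out = img.copy()
    for r, c, p in regions:
        out[r:r + p, c:c + p] = img.mean()
    return out

def haar_denoise(img, thresh=0.1):
    """One-level 2-D Haar transform with soft-thresholded detail bands;
    with thresh=0 this reconstructs the input exactly."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 4          # approximation band
    lh = (a + b - c - d) / 4          # detail bands carry the
    hl = (a - b + c - d) / 4          # high-frequency perturbation
    hh = (a - b - c + d) / 4
    soft = lambda x: np.sign(x) * np.maximum(np.abs(x) - thresh, 0)
    lh, hl, hh = soft(lh), soft(hl), soft(hh)
    out = np.empty_like(img)          # inverse Haar transform
    out[0::2, 0::2] = ll + lh + hl + hh
    out[0::2, 1::2] = ll + lh - hl - hh
    out[1::2, 0::2] = ll - lh + hl - hh
    out[1::2, 1::2] = ll - lh - hl + hh
    return out

def ciidefence_preprocess(img, cam_maps):
    """Fuse region-selective inpainting with wavelet denoising; both steps
    are non-differentiable w.r.t. the input, blocking gradient attacks."""
    regions = select_influential_regions(cam_maps)
    return haar_denoise(inpaint_mean(img, regions))
```

In the actual defence, the CAM responses come from the classifier under attack and the reconstruction is performed by an inpainting network; the sketch replaces both with trivial stand-ins to keep the control flow visible.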

Publication forum classification

Field of science, Statistics Finland