CIIDefence: Defeating Adversarial Attacks by Fusing Class-specific Image Inpainting and Image Denoising.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review
Title of host publication: 2019 International Conference on Computer Vision, ICCV 2019
Publication status: Published - 2019
Publication type: A4 Article in a conference publication
Event: IEEE/CVF International Conference on Computer Vision
Duration: 27 Oct 2019 → 2 Nov 2019
Name: Proceedings of the IEEE International Conference on Computer Vision
Conference: IEEE/CVF International Conference on Computer Vision
Period: 27/10/19 → 2/11/19
This paper presents a novel approach for protecting deep neural networks from adversarial attacks, i.e., methods that add well-crafted imperceptible modifications to the original inputs such that they are incorrectly classified with high confidence. The proposed defence mechanism is inspired by recent works that mitigate adversarial disturbances by means of image reconstruction and denoising. However, unlike previous works, we apply the reconstruction only to small and carefully selected image areas that are most influential to the current classification outcome. The selection process is guided by the class activation map responses obtained for multiple top-ranking class labels. The same regions are also the most prominent targets of adversarial perturbations and hence the most important to purify. The resulting inpainting task is substantially more tractable than full image reconstruction, while still being able to prevent adversarial attacks. Furthermore, we combine the selective image inpainting with wavelet-based image denoising to produce a non-differentiable layer that prevents the attacker from using gradient backpropagation. Moreover, the proposed nonlinearity cannot be easily approximated with a simple differentiable alternative, as demonstrated in experiments with the Backward Pass Differentiable Approximation (BPDA) attack. Finally, we experimentally show that the proposed Class-specific Image Inpainting Defence (CIIDefence) is able to withstand several powerful adversarial attacks, including BPDA. The obtained results are consistently better than those of other recent defence approaches.
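As an illustration only — the actual method uses class activation maps from the attacked classifier, a learned inpainting model, and a full wavelet denoiser — a minimal NumPy sketch of the pipeline's overall shape (select the most class-discriminative pixels, fill them, then denoise) might look like the following. The mean-fill inpainting, the single-level Haar transform, and all parameter values here are simplifying assumptions, not the paper's implementation.

```python
import numpy as np

def top_region_mask(cam, keep_frac=0.05):
    """Mask of the highest-response CAM pixels (hypothetical selection rule;
    the paper aggregates CAMs over multiple top-ranking class labels)."""
    thresh = np.quantile(cam, 1.0 - keep_frac)
    return cam >= thresh

def haar_denoise(img, thresh=0.1):
    """One-level 2-D Haar transform with soft-thresholded detail bands.
    A crude stand-in for the paper's wavelet denoising; assumes even H and W."""
    p00, p01 = img[0::2, 0::2], img[0::2, 1::2]
    p10, p11 = img[1::2, 0::2], img[1::2, 1::2]
    ll = (p00 + p01 + p10 + p11) / 4.0   # approximation band
    lh = (p00 + p01 - p10 - p11) / 4.0   # horizontal details
    hl = (p00 - p01 + p10 - p11) / 4.0   # vertical details
    hh = (p00 - p01 - p10 + p11) / 4.0   # diagonal details
    soft = lambda x: np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)
    lh, hl, hh = soft(lh), soft(hl), soft(hh)   # shrink detail coefficients
    out = np.empty_like(img)                    # inverse Haar transform
    out[0::2, 0::2] = ll + lh + hl + hh
    out[0::2, 1::2] = ll + lh - hl - hh
    out[1::2, 0::2] = ll - lh + hl - hh
    out[1::2, 1::2] = ll - lh - hl + hh
    return out

def ciidefence(img, cam, keep_frac=0.05, thresh=0.1):
    """Fill the most class-discriminative pixels, then wavelet-denoise.
    The mean fill is a placeholder for the paper's inpainting network."""
    mask = top_region_mask(cam, keep_frac)
    filled = img.copy()
    filled[mask] = img[~mask].mean()   # placeholder inpainting
    return haar_denoise(filled, thresh)
```

Both the quantile-based mask and the hard soft-threshold make the layer piecewise non-smooth, which gestures at why gradient backpropagation through such a purification step is obstructed; the real defence relies on the non-differentiable inpainting stage for this.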