Computer Vision News - October 2020

Understanding Adversarial Attacks on Deep Learning Based Medical Image Analysis Systems

This is no longer a problem for the scientific community alone; on the contrary, it touches moral and philosophical matters as well, since the decisions and predictions made by these systems can be falsified in such a way that hardly anybody could detect it. Previous works have already shown that classification accuracy drops substantially across different tasks under adversarial attacks, so the aim of this paper is to shed further light on the problem by performing and then detecting adversarial attacks in the medical imaging domain. It appears to be the first work of its kind to comprehensively investigate these issues, and it contributes to the fascinating goal of building explainable and robust deep learning systems for medical diagnosis. In particular, the paper focuses on understanding adversarial attacks on medical imaging and their detection. The authors answer some fundamental questions and then list possible reasons for the findings.

Research on medical image analysis is heavily influenced by deep neural networks, and deep learning models can be found in several imaging tasks such as segmentation, classification and registration, for both therapeutic and diagnostic purposes. The paper presents a thorough explanation of adversarial attacks and their detection, of which we report the most essential parts.

Adversarial Attacks

“An attacking method aims to maximize the classification error of the DNN model, whilst keeping the adversarial example x_adv within a small ε-ball centred at the original sample x (‖x_adv − x‖_p ≤ ε), where ‖·‖_p is the L_p norm.” Adversarial attacks can be either targeted or untargeted, depending on whether the misclassification is aimed at a specific class or just an arbitrary one. The paper focuses on untargeted attacks in the white-box setting (i.e. using adversarial gradients extracted directly from the target model, as opposed to attacking a surrogate model), under the L∞ perturbation constraint.

Four adversarial attacks are analysed: the Fast Gradient Sign Method (FGSM); the Basic Iterative Method (BIM); Projected Gradient Descent (PGD), regarded as the strongest first-order attack; and the Carlini and Wagner (CW) attack, which is the state-of-the-art optimization-based attack. The pipeline of the adversarial attacks used in these experiments is shown in the image below: “the input gradient extractor feeds the image into the pre-trained DNN classifier to obtain the input gradients, based upon which the image is perturbed to maximize the network’s loss to the correct class”.
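To make the gradient-sign attacks above more concrete, here is a minimal PyTorch sketch of FGSM and PGD under the L∞ constraint, assuming a generic pre-trained classifier model and a batch of images x (scaled to [0, 1]) with ground-truth labels y; the function names and the default budgets eps, alpha and steps are illustrative choices, not the authors' code.

import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8/255):
    # One gradient-sign step of size eps: perturb the image in the direction
    # that maximizes the cross-entropy loss to the correct class.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Iterated gradient-sign steps of size alpha, each followed by a projection
    # back into the L-infinity ball of radius eps around the original image x.
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    return x_adv.detach()

BIM follows essentially the same iterative scheme as PGD, while the CW attack replaces the fixed gradient-sign steps with an explicit optimization objective.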
