Title | Adversarially robust image classifier learning for real-world tasks: framework and empirical evaluation |
Author | Kantapon Pornprasertsakul |
Subject(s) | Machine learning; Neural networks (Computer science); Image processing |
Note | A thesis submitted in partial fulfillment of the requirements for the degree of Master of Engineering in Computer Science |
Publisher | Asian Institute of Technology |
Abstract | Among the most prominent problems solved by machine learning (ML) techniques is image classification, and the best-performing ML technique for this problem is the application of convolutional neural networks (CNNs). However, state-of-the-art CNNs are vulnerable to small adversarially-created perturbations. In response, many researchers have used the following technique to add a defense mechanism to their classification models. Given a classifier built based on a training set, they retrain it using the original training data augmented with adversarially-created perturbations. We refer to the resulting models as adversarially-trained classification models. There are two types of adversarially-trained classification models, depending on the characteristics of the adversarial perturbation generator (attacker). The attacker can be either a fixed algorithm (fixed attacker), or it can evolve based on the training data it is exposed to (adaptive attacker). A natural hypothesis is that adversarially-trained classification models born of adaptive attacks would be stronger than those born of fixed ones. However, we find that adversarial training with either type of attacker offers significant improvements over the original classification model, but generally, the improvement holds only against attacks by the same algorithm used during adversarial training. To overcome this weakness, we propose an end-to-end training framework that subjects the model being trained to multiple attackers, so that the resulting model is robust against different types of attacks. We find that classifiers trained under the framework can be adapted to be robust against adaptive attackers, but it is more difficult to obtain robustness against fixed whitebox adversaries at the same time. To address this issue, we propose several regularization techniques, such as weight clipping, to improve classifier robustness against both types of adversaries, but we obtain only slight improvements. Our next step is to identify suitable techniques to improve classification models' ability to learn from both fixed and adaptive adversaries during training. |
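The abstract describes adversarial training against a fixed attacker and weight clipping as a regularizer. As a minimal illustrative sketch (not the thesis's actual framework), the following PyTorch snippet shows one epoch of adversarial training where the fixed attacker is one-step FGSM; the function names, epsilon, and clipping threshold are assumptions chosen for illustration, and inputs are assumed to lie in [0, 1].

```python
# Illustrative sketch only: adversarial training with a fixed FGSM attacker,
# plus optional weight clipping. Details (eps, clip, mixing clean and
# adversarial batches) are assumptions, not the thesis's exact method.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=8 / 255):
    """Fixed attacker: one-step FGSM perturbation of a batch."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    # Step in the direction that increases the loss, stay in valid pixel range.
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def adversarial_training_epoch(model, loader, optimizer, eps=8 / 255, clip=None):
    """Train on clean batches augmented with adversarial examples."""
    model.train()
    for x, y in loader:
        x_adv = fgsm_perturb(model, x, y, eps)
        optimizer.zero_grad()
        # Augment the original data with adversarially-created perturbations.
        loss = F.cross_entropy(model(torch.cat([x, x_adv])), torch.cat([y, y]))
        loss.backward()
        optimizer.step()
        if clip is not None:
            # Weight clipping regularization (hypothetical threshold).
            with torch.no_grad():
                for p in model.parameters():
                    p.clamp_(-clip, clip)
```

An adaptive attacker would replace `fgsm_perturb` with a generator that is itself updated during training, and the multi-attacker framework would draw perturbations from several such attackers per batch.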
Year | 2020 |
Type | Thesis |
School | School of Engineering and Technology (SET) |
Department | Department of Information and Communications Technologies (DICT) |
Academic Program/FoS | Computer Science (CS) |
Chairperson(s) | Dailey, Matthew N.; |
Examination Committee(s) | Chanathip Namprempre; Mongkol Ekpanyapong; |
Scholarship Donor(s) | His Majesty the King’s Scholarships (Thailand); |
Degree | Thesis (M. Eng.) - Asian Institute of Technology, 2020 |