On Adversarial Robustness: A Neural Architecture Search Perspective
Chaitanya Devaguptapu,
and Vineeth N Balasubramanian
In Workshops on Neural Architecture Search, Responsible AI, From Shallow to Deep: Overcoming Limited and Adverse Data (S2D-OLAD), and Robust and Reliable ML in the Real World at ICLR
Adversarial robustness of deep learning models has gained much traction in the last few years. Various attacks and defenses have been proposed to improve the adversarial robustness of modern-day deep learning architectures. While all these approaches help improve robustness, one promising direction remains unexplored: the complex topology of the neural network architecture. In this work, we empirically answer the following question: "Can the complex topology of a neural network give adversarial robustness without any form of adversarial training?" by experimenting with different hand-crafted and NAS-based architectures. Our findings show that, for small-scale attacks, NAS-based architectures are more robust than hand-crafted architectures on small-scale datasets and simple tasks. However, as the dataset's size or the task's complexity increases, hand-crafted architectures become more robust than NAS-based ones. We perform the first large-scale study to understand adversarial robustness purely from an architectural perspective. Our results show that random sampling in the search space of DARTS (a popular NAS method), combined with simple ensembling, can improve robustness to PGD attacks by nearly 12%. We show that NAS, which is popular for achieving SoTA accuracy, can provide adversarial accuracy as a free add-on without any form of adversarial training. Our results show that leveraging the power of neural network topology with methods like ensembles can be an excellent way to achieve adversarial robustness without adversarial training. We also introduce a metric that can be used to calculate the trade-off between clean accuracy and adversarial robustness.
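The simple ensembling mentioned above can be illustrated with a minimal sketch: given per-model logits from several independently sampled architectures, average the logits and take the argmax. The function name and the toy logit values are hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

def ensemble_predict(logits_per_model):
    """Average the logits of several independently sampled architectures
    and return the ensemble's class prediction for each input."""
    stacked = np.stack(logits_per_model)   # (n_models, n_samples, n_classes)
    mean_logits = stacked.mean(axis=0)     # (n_samples, n_classes)
    return mean_logits.argmax(axis=1)      # (n_samples,)

# Toy example: three hypothetical models scoring two samples over three classes.
m1 = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
m2 = np.array([[1.8, 0.7, 0.2], [0.1, 0.4, 2.0]])
m3 = np.array([[2.2, 0.3, 0.4], [0.3, 1.9, 0.2]])
preds = ensemble_predict([m1, m2, m3])
print(preds)  # → [0 1]
```

In practice the same averaging would be applied to the outputs of full networks sampled from the DARTS search space before measuring robustness under an attack such as PGD.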
Active Learning (AL) techniques aim to minimize the training data required to train a model for a given task. Pool-based AL techniques start with a small initial labeled pool and then iteratively pick batches of the most informative samples for labeling. Generally, the initial pool is sampled randomly and labeled to seed the AL iterations. While recent studies have focused on evaluating the robustness of various query functions in AL, little to no attention has been given to the design of the initial labeled pool. Given the recent successes of learning representations in self-supervised/unsupervised ways, we propose to study whether an intelligently sampled initial labeled pool can improve deep AL performance. We will investigate the effect of intelligently sampled initial labeled pools, including the use of self-supervised and unsupervised strategies, on deep AL methods. We describe our experimental setup, implementation details, datasets, and performance metrics, as well as planned ablation studies, in this proposal. If intelligently sampled initial pools improve AL performance, our work could make a positive contribution to boosting AL performance with no additional annotation, developing datasets with lower annotation cost in general, and promoting further research in the use of unsupervised learning methods for AL.
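One way to sample an initial pool "intelligently" is to pick diverse points in a learned feature space instead of drawing uniformly at random. The sketch below uses greedy farthest-point sampling over embeddings as one plausible stand-in for such a strategy; the function name and the random embeddings are assumptions for illustration, not the proposal's specific method.

```python
import numpy as np

def farthest_point_init(embeddings, k, seed=0):
    """Greedily pick k mutually distant points as the initial labeled pool.
    `embeddings` would come from a self-supervised/unsupervised encoder."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(embeddings)))]
    dists = np.linalg.norm(embeddings - embeddings[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(dists.argmax())                # farthest from current pool
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return chosen

# Toy data standing in for self-supervised features of an unlabeled pool.
emb = np.random.default_rng(1).normal(size=(100, 16))
pool = farthest_point_init(emb, k=10)
print(len(pool), len(set(pool)))  # → 10 10
```

The selected indices would then be sent for labeling to seed the usual pool-based AL loop.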
Borrow From Anywhere: Pseudo Multi-Modal Object Detection in Thermal Imagery
Chaitanya Devaguptapu,
Manuj M Sharma,
and Vineeth N Balasubramanian
In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops
Can we improve detection in the thermal domain by borrowing features from rich domains like visual RGB? In this paper, we propose a ‘pseudo-multimodal’ object detector trained on natural image domain data to help improve the performance of object detection in thermal images. We assume access to a large-scale dataset in the visual RGB domain and a relatively smaller dataset (in terms of instances) in the thermal domain, as is common today. We propose the use of well-known image-to-image translation frameworks to generate pseudo-RGB equivalents of a given thermal image, and then use a multi-modal architecture for object detection in the thermal image. We show that our framework outperforms existing benchmarks without the explicit need for paired training examples from the two domains. We also show that our framework can learn with less data from the thermal domain.
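The multi-modal step can be sketched in miniature: features extracted from the thermal image and from its pseudo-RGB translation (produced by an image-to-image translation model, not shown here) are fused before the detection head. Concatenation is used below purely as a simple illustrative fusion choice; the function name and feature shapes are hypothetical.

```python
import numpy as np

def fuse_features(thermal_feat, pseudo_rgb_feat):
    """Fuse backbone features from the thermal branch and the pseudo-RGB
    branch by channel-wise concatenation, ahead of a detection head."""
    return np.concatenate([thermal_feat, pseudo_rgb_feat], axis=-1)

t = np.ones((1, 256))   # hypothetical thermal backbone features
p = np.zeros((1, 256))  # hypothetical pseudo-RGB backbone features
fused = fuse_features(t, p)
print(fused.shape)  # → (1, 512)
```

Because the pseudo-RGB input is generated from the thermal image itself, no paired RGB/thermal training examples are required at inference time.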