Revisiting the Importance of Amplifying Bias for Debiasing

Abstract

In image classification, "debiasing" aims to train a classifier to be less susceptible to dataset bias, the strong correlation between peripheral attributes of data samples and a target class. For example, even if the frog class in the dataset mainly consists of frog images with a swamp background (i.e., bias-aligned samples), a debiased classifier should be able to correctly classify a frog at a beach (i.e., a bias-conflicting sample). Recent debiasing approaches commonly use two components, a biased model $f_B$ and a debiased model $f_D$. $f_B$ is trained to focus on the bias-aligned samples (i.e., overfitted to the bias), while $f_D$ is mainly trained on the bias-conflicting samples by concentrating on the samples that $f_B$ fails to learn, which leads $f_D$ to be less susceptible to the dataset bias. While state-of-the-art debiasing techniques have aimed to better train $f_D$, we focus on training $f_B$, a component overlooked until now. Our empirical analysis reveals that removing the bias-conflicting samples from the training set of $f_B$ is important for improving the debiasing performance of $f_D$: since the bias-conflicting samples do not include the bias attribute, they act as noisy samples that hinder amplifying the bias in $f_B$. To this end, we propose a simple yet effective data sample selection method that removes the bias-conflicting samples in order to construct a bias-amplified dataset for training $f_B$. Our selection method can be directly applied to existing reweighting-based debiasing approaches, yielding a consistent performance boost and achieving state-of-the-art performance on both synthetic and real-world datasets.
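To make the bias-amplification step concrete, here is a minimal sketch of one way the sample selection could look in PyTorch. The abstract only states that bias-conflicting samples are removed before training $f_B$; the specific criterion used below (ranking training samples by their loss under $f_B$ and keeping only the lowest-loss fraction, controlled by a hypothetical `keep_ratio`), the helper name `select_bias_aligned`, and the assumption that the data loader yields sample indices are all illustrative choices, not the paper's exact method.

```python
import torch
import torch.nn.functional as F


def select_bias_aligned(f_B, loader, device, keep_ratio=0.8):
    """Return indices of the training samples with the lowest loss under f_B.

    Low-loss samples are assumed to be bias-aligned (easy for the biased
    model), so keeping only them yields a bias-amplified subset for f_B.
    `loader` is assumed to yield (inputs, labels, sample_indices).
    """
    f_B.eval()
    losses, indices = [], []
    with torch.no_grad():
        for x, y, idx in loader:
            logits = f_B(x.to(device))
            loss = F.cross_entropy(logits, y.to(device), reduction="none")
            losses.append(loss.cpu())
            indices.append(idx)
    losses = torch.cat(losses)
    indices = torch.cat(indices)
    k = int(keep_ratio * len(losses))
    keep = torch.argsort(losses)[:k]  # lowest-loss samples first
    return indices[keep]
```

In this sketch, the returned indices would be wrapped with `torch.utils.data.Subset` to build the bias-amplified training set for $f_B$, while $f_D$ is trained as in existing reweighting-based methods, i.e., with per-sample weights derived from how poorly $f_B$ fits each sample.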
