Balanced Product of Calibrated Experts for Long-Tailed Recognition
Many real-world recognition problems are characterized by long-tailed label distributions. These distributions make representation learning highly challenging due to limited generalization over the tail classes. If the test distribution differs from the training distribution, e.g., uniform versus long-tailed, the resulting distribution shift must be addressed. A recent line of work proposes learning multiple diverse experts to tackle this issue. Ensemble diversity is encouraged by various techniques, e.g., by specializing different experts on the head and the tail classes. In this work, we take an analytical approach and extend the notion of logit adjustment to ensembles to form a Balanced Product of Experts (BalPoE). BalPoE generalizes several previous approaches and combines a family of experts with different test-time target distributions. We show how to properly define these distributions and combine the experts in order to achieve unbiased predictions, by proving that the ensemble is Fisher-consistent for minimizing the balanced error. Our theoretical analysis shows that our balanced ensemble requires calibrated experts, which we achieve in practice using mixup. In extensive experiments, our method obtains new state-of-the-art results on three long-tailed datasets: CIFAR-100-LT, ImageNet-LT and iNaturalist-2018. Our code will be released upon paper acceptance.
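As a rough illustration of the idea of extending logit adjustment to an ensemble, the following NumPy sketch combines three toy experts as a product of experts. It is a minimal sketch under stated assumptions, not the released implementation: the function names, the toy prior, and the adjustment strengths `taus` are hypothetical, and the adjustment is applied post hoc to a perfectly calibrated posterior rather than through a training loss. It shows that when the adjustment strengths average to 1, the label prior cancels in the combined prediction.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def adjusted_logits(logits, log_prior, tau):
    """Logit adjustment: subtract tau * log(prior) from the logits.
    tau = 0 keeps the long-tailed bias, tau = 1 removes it,
    tau > 1 over-corrects toward the tail classes."""
    return logits - tau * log_prior

def product_of_experts(expert_logits):
    """Combine experts by averaging their log-probabilities,
    i.e. a renormalized geometric mean of their distributions."""
    mean_log_prob = np.mean([log_softmax(z) for z in expert_logits], axis=0)
    return np.exp(log_softmax(mean_log_prob))

# Toy setup (assumption): 3 classes with a long-tailed training prior and an
# uninformative input, so a calibrated model's posterior equals the prior.
prior = np.array([0.7, 0.2, 0.1])
log_prior = np.log(prior)
base_logits = log_prior.copy()

# Hypothetical adjustment strengths; their mean is 1.
taus = [0.0, 1.0, 2.0]
experts = [adjusted_logits(base_logits, log_prior, t) for t in taus]

head_expert = np.exp(log_softmax(experts[0]))  # still biased toward the head
ensemble = product_of_experts(experts)         # prior cancels on average

print("head expert:", head_expert)
print("ensemble:   ", ensemble)
```

In this toy case the individual experts are biased toward the head or the tail, while their product is uniform over classes because the input carries no class evidence. The cancellation relies on the base posterior being calibrated, which mirrors the role of calibration stated in the abstract.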