Demystifying How Self-Supervised Features Improve Training from Noisy Labels

Zhaowei Zhu
Yang Liu
Abstract

The advancement of self-supervised learning (SSL) motivates researchers to apply SSL to other tasks, such as learning with noisy labels. Recent literature indicates that methods built on SSL features can substantially improve the performance of learning with noisy labels. Nonetheless, the deeper reasons why (and how) SSL features benefit training with noisy labels are less well understood. In this paper, we study why and how self-supervised features help networks resist label noise, using both theoretical analyses and numerical experiments. Our results explain when and why fixing the SSL encoder helps training converge to a better optimum, and why an unfixed encoder is unstable yet tends to achieve a better best-epoch accuracy in more challenging noise settings. Further, we provide insights into how knowledge distilled from SSL features can strike a balance between a fixed and an unfixed encoder. We hope our work offers a better understanding of learning with noisy labels from the perspective of self-supervised learning and can serve as a guideline for further research. Code is available at github.com/UCSC-REAL/SelfSup_NoisyLabel.
