How Many Data Are Needed for Robust Learning?

23 February 2022
Yihan Wu
Heng Huang
Hongyang R. Zhang
Abstract

We show that the sample complexity of the robust interpolation problem can be exponential in the input dimensionality, and we discover a phase transition phenomenon when the data lie in a unit ball. Robust interpolation refers to the problem of interpolating $n$ noisy training data in $\mathbb{R}^d$ by a Lipschitz function. Although this problem is well understood when the covariates are drawn from an isoperimetric distribution, much remains unknown about its behavior under generic or even worst-case distributions. Our results are two-fold: 1) Too many data hurt robustness: we provide a tight and universal Lipschitzness lower bound of $\Omega(n^{1/d})$ for the interpolating function under arbitrary data distributions. This result disproves the potential existence of an $\mathcal{O}(1)$-Lipschitz function in the overparametrized regime where $n = \exp(\omega(d))$. 2) Too few data hurt robustness: $n = \exp(\Omega(d))$ samples are necessary for any $\mathcal{O}(1)$-Lipschitz learning algorithm to achieve a good population error under certain distributions. Perhaps surprisingly, our results shed light on the curse of big data and the blessing of dimensionality for robustness, and reveal an intriguing phase transition at $n = \exp(\Theta(d))$.
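The flavor of the $\Omega(n^{1/d})$ lower bound in part 1 can be conveyed by a standard packing argument; this is only an illustrative sketch under the assumptions of a unit-ball domain and labels separated by a constant, not the paper's formal proof:

$$
\min_{i \ne j} \|x_i - x_j\| \;\le\; C\, n^{-1/d}
\quad\text{for any } x_1,\dots,x_n \in B_1(\mathbb{R}^d),
$$

since otherwise the $n$ disjoint balls of radius $\tfrac{1}{2} C n^{-1/d}$ centered at the points would exceed the volume of a slightly enlarged unit ball. If such a nearby pair carries noisy labels with $|y_i - y_j| \ge c > 0$, then any function $f$ interpolating all $n$ labels satisfies

$$
\mathrm{Lip}(f) \;\ge\; \frac{|f(x_i) - f(x_j)|}{\|x_i - x_j\|} \;\ge\; \frac{c}{C}\, n^{1/d} \;=\; \Omega\!\left(n^{1/d}\right).
$$

Read together with part 2, this suggests that $\mathcal{O}(1)$-Lipschitz robust learning is only possible in the window $n = \exp(\Theta(d))$, which is the phase transition highlighted in the abstract.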
