Lower Generalization Bounds for GD and SGD in Smooth Stochastic Convex Optimization

19 March 2023
Peiyuan Zhang
Jiaye Teng
J. Zhang
arXiv:2303.10758
Abstract

This work studies the generalization error of gradient methods. More specifically, we focus on how the number of training steps $T$ and the step-size $\eta$ affect generalization in smooth stochastic convex optimization (SCO) problems. We first provide tight excess risk lower bounds for Gradient Descent (GD) and Stochastic Gradient Descent (SGD) in the general non-realizable smooth SCO setting, suggesting that existing stability analyses are tight in their step-size and iteration dependence, and that overfitting provably happens. Next, we study the case when the loss is realizable, i.e. an optimal solution minimizes all the data points. Recent works show that better rates can be attained, but the improvement is reduced when training time is long. Our paper examines this observation by providing excess risk lower bounds for GD and SGD in two realizable settings: (1) $\eta T = \mathcal{O}(n)$, and (2) $\eta T = \Omega(n)$, where $n$ is the size of the dataset. In the first case, $\eta T = \mathcal{O}(n)$, our lower bounds tightly match and certify the respective upper bounds. However, for the case $\eta T = \Omega(n)$, our analysis indicates a gap between the lower and upper bounds. We conjecture that the gap can be closed by improving the upper bounds, and we support this conjecture with analyses in two special scenarios.
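
For readers outside this subfield, the following is a minimal sketch of the standard smooth SCO setup the abstract refers to. These are textbook definitions (empirical and population risk, the GD and SGD updates, and realizability), not constructions or results from the paper itself.

% Standard smooth stochastic convex optimization (SCO) setup; textbook
% definitions, not taken from the paper.
% Dataset S = {z_1, ..., z_n} drawn i.i.d. from a distribution D;
% the per-sample loss f(w; z) is convex and smooth in w.
\[
  F(w) \;=\; \mathbb{E}_{z \sim \mathcal{D}}\bigl[f(w; z)\bigr],
  \qquad
  F_S(w) \;=\; \frac{1}{n} \sum_{i=1}^{n} f(w; z_i).
\]
% Excess (population) risk of the algorithm's output w_T:
\[
  \mathbb{E}\bigl[F(w_T)\bigr] \;-\; \min_{w} F(w).
\]
% Gradient Descent (GD) on the empirical risk, step-size \eta, T steps:
\[
  w_{t+1} \;=\; w_t - \eta \, \nabla F_S(w_t), \qquad t = 0, \dots, T-1.
\]
% Stochastic Gradient Descent (SGD): draw an index i_t uniformly from
% {1, ..., n} and update with a single-sample gradient:
\[
  w_{t+1} \;=\; w_t - \eta \, \nabla f(w_t; z_{i_t}).
\]
% Realizable case: some minimizer w^* of F_S minimizes every per-sample
% loss, i.e. \nabla f(w^*; z_i) = 0 for all i (the abstract's "an optimal
% solution minimizes all the data points").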
