Basic Inequalities for First-Order Optimization with Applications to Statistical Risk Analysis

31 December 2025

Seunghoon Paik

Kangjie Zhou

Matus Telgarsky

Ryan J. Tibshirani


Main: 24 pages; Bibliography: 4 pages; Appendix: 19 pages; 6 figures; 4 tables

Abstract

We introduce basic inequalities for first-order iterative optimization algorithms, forming a simple and versatile framework that connects implicit and explicit regularization. While related inequalities appear in the literature, we isolate and highlight a specific form and develop it as a well-rounded tool for statistical analysis. Let $f$ denote the objective function to be optimized. Given a first-order iterative algorithm initialized at $\theta_0$ with current iterate $\theta_T$, the basic inequality upper bounds $f(\theta_T)-f(z)$ for any reference point $z$ in terms of the accumulated step sizes and the distances between $\theta_0$, $\theta_T$, and $z$. The bound translates the number of iterations into an effective regularization coefficient in the loss function. We demonstrate this framework through analyses of training dynamics and prediction risk bounds. In addition to revisiting and refining known results on gradient descent, we provide new results for mirror descent with Bregman divergence projection, for generalized linear models trained by gradient descent and exponentiated gradient descent, and for randomized predictors. We illustrate and supplement these theoretical findings with experiments on generalized linear models.
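To make the shape of such a bound concrete, the sketch below numerically checks a standard inequality of this type for gradient descent on a least-squares objective. It is an illustration under stated assumptions, not necessarily the exact statement used in the paper: for convex, $L$-smooth $f$ and a constant step size $\eta \le 1/L$, the textbook bound is $f(\theta_T) - f(z) \le (\|\theta_0 - z\|^2 - \|\theta_T - z\|^2)/(2\eta T)$ for any $z$. The function name `gd_basic_inequality_gap`, the synthetic data, and the choice $\theta_0 = 0$ are all hypothetical.

```python
import numpy as np

def gd_basic_inequality_gap(X, y, z, eta, T):
    """Run T gradient descent steps on least squares from theta_0 = 0.

    Returns (lhs, rhs) of the standard bound (an assumed form, valid for
    convex L-smooth f and constant eta <= 1/L):
        f(theta_T) - f(z) <= (||theta_0 - z||^2 - ||theta_T - z||^2) / (2 * eta * T)
    """
    n = X.shape[0]

    def f(theta):
        # Least-squares objective: (1 / 2n) * ||X theta - y||^2
        return 0.5 * np.mean((X @ theta - y) ** 2)

    theta0 = np.zeros(X.shape[1])
    theta = theta0.copy()
    for _ in range(T):
        grad = X.T @ (X @ theta - y) / n
        theta = theta - eta * grad

    lhs = f(theta) - f(z)
    rhs = (np.sum((theta0 - z) ** 2) - np.sum((theta - z) ** 2)) / (2 * eta * T)
    return lhs, rhs

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=50)

# Smoothness constant of f: largest eigenvalue of X^T X / n.
L = np.linalg.eigvalsh(X.T @ X / X.shape[0]).max()

# The bound should hold for an arbitrary reference point z.
z = rng.normal(size=5)
lhs, rhs = gd_basic_inequality_gap(X, y, z, eta=1.0 / L, T=20)
print(lhs <= rhs + 1e-9)
```

Rearranging this assumed bound gives $f(\theta_T) + \|\theta_T - z\|^2/(2\eta T) \le f(z) + \|\theta_0 - z\|^2/(2\eta T)$, so in this special case $1/(2\eta T)$ plays the role of a ridge-type regularization coefficient that shrinks as the number of iterations grows.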
