
Generalization of Gibbs and Langevin Monte Carlo Algorithms in the Interpolation Regime

7 October 2025

Main: 11 Pages

14 Figures

Bibliography: 3 Pages

2 Tables

Appendix: 16 Pages

Abstract

This paper provides data-dependent bounds on the expected error of the Gibbs algorithm in the overparameterized interpolation regime, where low training errors are also obtained for impossible data, such as random labels in classification. The results show that generalization in the low-temperature regime is already signaled by small training errors in the noisier high-temperature regime. The bounds are stable under approximation with Langevin Monte Carlo algorithms. The analysis motivates the design of an algorithm to compute the bounds, which on the MNIST and CIFAR-10 datasets yield nontrivial, close predictions of the test error for data with true labels, while maintaining a correct upper bound on the test error for random labels.
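As a rough illustration of the sampling step the abstract refers to, the following is a minimal sketch of an unadjusted Langevin Monte Carlo iteration targeting a Gibbs distribution with density proportional to exp(-beta * L(w)), where beta is the inverse temperature. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional quadratic loss, the flat prior, the step size, and the names `grad_loss` and `langevin_samples`. It is not the paper's bound-computing algorithm.

```python
import math
import random


def grad_loss(w, center=1.0):
    # Gradient of a toy loss L(w) = 0.5 * (w - center)^2 (hypothetical stand-in
    # for an empirical training loss).
    return w - center


def langevin_samples(beta, step, n_steps, burn_in, seed=0):
    # Unadjusted Langevin iteration:
    #   w <- w - step * grad L(w) + sqrt(2 * step / beta) * xi,  xi ~ N(0, 1).
    # For small step sizes the iterates approximately follow the Gibbs
    # density proportional to exp(-beta * L(w)) (flat prior assumed here).
    rng = random.Random(seed)
    w = 0.0
    noise_scale = math.sqrt(2.0 * step / beta)
    samples = []
    for k in range(n_steps):
        w = w - step * grad_loss(w) + noise_scale * rng.gauss(0.0, 1.0)
        if k >= burn_in:
            samples.append(w)
    return samples


# Larger beta means lower temperature and a more concentrated distribution.
samples = langevin_samples(beta=4.0, step=0.05, n_steps=200_000, burn_in=1_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

For this toy loss the target is a Gaussian centered at 1 with variance 1/beta, so the printed mean and variance should land near 1 and 0.25, up to Monte Carlo error and a small discretization bias from the finite step size.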
