Performance Limits of Online Stochastic Sub-Gradient Learning
This work examines the performance of stochastic sub-gradient learning strategies under weaker conditions than those usually considered in the literature. The conditions are shown to be satisfied automatically by several important cases of interest, including Linear-SVM, LASSO, and Total-Variation denoising formulations. In contrast, these problems do not automatically satisfy the traditional assumptions, so conclusions derived under those earlier assumptions are not directly applicable to them. The analysis establishes that stochastic sub-gradient strategies can attain exponential convergence rates to the steady-state, as opposed to sub-linear rates. A realizable exponential-weighting procedure is proposed to smooth the intermediate iterates generated by the sub-gradient procedure and to guarantee the established performance bounds in terms of convergence rate and excess-risk performance. Both single-agent and multi-agent scenarios are studied, where the latter case assumes that a collection of agents is interconnected by a topology and can only interact locally with their neighbors. The theoretical conclusions are illustrated by several examples and simulations, including comparisons with the FISTA procedure.
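As a concrete illustration of the ingredients the abstract mentions, the following is a minimal sketch of a stochastic sub-gradient recursion with exponentially weighted iterate averaging, applied to a toy streaming LASSO problem. The problem setup, step size, and weighting factor are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy streaming LASSO (not from the paper):
# minimize E[0.5*(y - x^T w)^2] + rho*||w||_1 over w.
d = 10
w_true = np.zeros(d)
w_true[:3] = [1.0, -2.0, 0.5]
rho = 0.01     # l1-regularization weight
mu = 0.05      # constant step size
beta = 1 - mu  # exponential-weighting factor (illustrative choice)

w = np.zeros(d)      # raw sub-gradient iterate
w_bar = np.zeros(d)  # exponentially weighted average of iterates
norm = 0.0           # running normalizer for the geometric weights

for k in range(5000):
    # one streaming sample (x_k, y_k)
    x = rng.standard_normal(d)
    y = x @ w_true + 0.1 * rng.standard_normal()
    # stochastic sub-gradient of the regularized instantaneous loss
    g = -(y - x @ w) * x + rho * np.sign(w)
    w = w - mu * g
    # exponential weighting: recent iterates receive geometrically
    # larger weight, implemented as a normalized running average
    norm = beta * norm + 1.0
    w_bar = w_bar + (w - w_bar) / norm
```

The averaged iterate `w_bar` smooths the fluctuations of the raw iterate `w` around the minimizer while still tracking it geometrically fast, which is the role the abstract ascribes to the exponential-weighting procedure.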