Normalized Gradients for All

Abstract
In this short note, I show how to adapt to Hölder smoothness using normalized gradients in a black-box way. Moreover, the bound depends on a novel notion of local Hölder smoothness. The main idea comes directly from Levy [2017].