Enhanced Bilevel Optimization via Bregman Distance

Neural Information Processing Systems (NeurIPS), 2021
Abstract

Bilevel optimization has been widely applied to many machine learning problems such as hyperparameter optimization, policy optimization, and meta-learning. Although many bilevel optimization methods have recently been proposed, they still suffer from high computational complexity and do not handle the more general bilevel problems with nonsmooth regularization. Thus, in this paper, we propose a class of enhanced bilevel optimization methods based on Bregman distance for solving bilevel problems in which the outer subproblem is nonconvex and possibly nonsmooth, and the inner subproblem is strongly convex. Specifically, we propose a bilevel optimization method based on Bregman distance (BiO-BreD) for solving deterministic bilevel problems, which attains a lower computational complexity than the best known results. Meanwhile, we also propose a stochastic bilevel optimization method (SBiO-BreD) for solving stochastic bilevel problems based on stochastic approximated gradients and Bregman distance. Moreover, we further propose an accelerated version of the SBiO-BreD method (ASBiO-BreD) using a variance-reduction technique, which achieves a lower computational complexity than the best known bound with respect to the condition number κ and the target accuracy ε for finding an ε-stationary point. We employ the data hyper-cleaning task to demonstrate that our algorithms outperform existing bilevel algorithms.
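To make the setting concrete, the following is a minimal sketch (not the authors' BiO-BreD algorithm) of the general recipe the abstract describes: a strongly convex inner problem solved exactly, a hypergradient of the outer objective via the implicit function theorem, and an outer update that is a Bregman proximal (mirror-descent) step. The quadratic objectives, the negative-entropy distance-generating function on the simplex, and all parameter values are illustrative assumptions.

```python
import numpy as np

def inner_solution(x, mu=0.25):
    # Strongly convex inner problem g(x, y) = 0.5*||y - x||^2 + 0.5*mu*||y||^2
    # has the closed-form minimizer y*(x) = x / (1 + mu).
    return x / (1.0 + mu)

def hypergradient(x, b, mu=0.25):
    # Outer objective F(x) = f(x, y*(x)) with f(x, y) = 0.5*||y - b||^2.
    # By the implicit function theorem, dy*/dx = I / (1 + mu), so
    # grad F(x) = (y*(x) - b) / (1 + mu).
    y = inner_solution(x, mu)
    return (y - b) / (1.0 + mu)

def mirror_step(x, grad, eta=1.0):
    # Bregman (mirror-descent) update with the negative-entropy
    # distance-generating function on the probability simplex:
    #   x+ = argmin_z <grad, z> + (1/eta) * D_KL(z, x),
    # which admits the multiplicative closed form below.
    z = x * np.exp(-eta * grad)
    return z / z.sum()

mu, b = 0.25, np.array([0.6, 0.2])   # fixed point: x = (1 + mu) * b = [0.75, 0.25]
x = np.array([0.5, 0.5])
for _ in range(500):
    x = mirror_step(x, hypergradient(x, b, mu), eta=2.0)
print(np.round(x, 3))  # converges toward [0.75, 0.25]
```

The Bregman step here reduces to a multiplicative weights update; the paper's methods replace the exact inner solve and exact hypergradient with approximate (and, for SBiO-BreD/ASBiO-BreD, stochastic and variance-reduced) estimates.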
