On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis

2 January 2023
Lesi Chen, Jing Xu, Jingzhao Zhang
arXiv:2301.00712
Abstract

Bilevel optimization reveals the inner structure of otherwise oblique optimization problems, such as hyperparameter tuning, neural architecture search, and meta-learning. A common goal in bilevel optimization is to minimize a hyper-objective that implicitly depends on the solution set of the lower-level function. Although this hyper-objective approach is widely used, its theoretical properties have not been thoroughly investigated in cases where the lower-level functions lack strong convexity. In this work, we first provide hardness results showing that the goal of finding stationary points of the hyper-objective for nonconvex-convex bilevel optimization can be intractable for zero-respecting algorithms. We then study a class of tractable nonconvex-nonconvex bilevel problems in which the lower-level function satisfies the Polyak-Łojasiewicz (PL) condition. We show that a simple first-order algorithm can achieve improved complexity bounds of $\tilde{\mathcal{O}}(\epsilon^{-2})$, $\tilde{\mathcal{O}}(\epsilon^{-4})$, and $\tilde{\mathcal{O}}(\epsilon^{-6})$ in the deterministic, partially stochastic, and fully stochastic settings, respectively.
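To make the setup concrete, the hyper-objective in this literature is typically written as follows; the notation $f$, $g$, $Y^*(x)$ reflects standard usage rather than being quoted from the paper:

    \min_{x} \; \varphi(x) := \min_{y \in Y^*(x)} f(x, y),
    \qquad Y^*(x) := \arg\min_{y} g(x, y).

"Finding small hyper-gradients" then means finding an $\epsilon$-stationary point of $\varphi$, i.e. a point $x$ with $\|\nabla \varphi(x)\| \le \epsilon$. When $g(x,\cdot)$ is not strongly convex, $Y^*(x)$ need not be a singleton and $\varphi$ can behave badly, which is the regime the hardness results address; the PL condition restores enough regularity for the lower-level problem to be solved at a linear rate by gradient descent.

The sketch below illustrates the double-loop template that a simple first-order method for such problems follows: an inner gradient-descent loop approximates a lower-level solution, and an outer loop takes steps along an approximate hyper-gradient. The toy instance, step sizes, and closed-form chain rule are illustrative assumptions chosen so the example is self-checking; this is not the paper's exact algorithm, and the toy lower level is strongly convex, a special case of PL.

    import numpy as np

    # Toy bilevel instance (illustrative only):
    #   lower level: g(x, y) = 0.5 * ||y - A x||^2   (strongly convex in y,
    #                hence PL; the paper only needs the weaker PL condition)
    #   upper level: f(x, y) = 0.5 * ||y - b||^2
    # Here y*(x) = A x, so phi(x) = 0.5 * ||A x - b||^2 and
    # grad phi(x) = A^T (A x - b), which lets us check the result.

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 3))
    b = rng.standard_normal(5)

    def grad_g_y(x, y):
        # gradient of the lower-level objective in y
        return y - A @ x

    def grad_f_y(x, y):
        # gradient of the upper-level objective in y
        return y - b

    def approx_hypergradient(x, y, inner_steps=50, eta=0.5):
        # Inner loop: gradient descent on g(x, .) approximates y*(x);
        # under PL this converges linearly.
        for _ in range(inner_steps):
            y = y - eta * grad_g_y(x, y)
        # Chain rule: for this toy instance dy*/dx = A, so the
        # approximate hyper-gradient is A^T grad_f_y(x, y).
        return A.T @ grad_f_y(x, y), y

    x, y = np.zeros(3), np.zeros(5)
    for _ in range(300):
        hg, y = approx_hypergradient(x, y)  # warm-start y across outer steps
        x = x - 0.05 * hg                   # outer step on the hyper-objective

    print("final hyper-gradient norm:", np.linalg.norm(A.T @ (A @ x - b)))

Roughly speaking, this deterministic double-loop scheme is the regime of the $\tilde{\mathcal{O}}(\epsilon^{-2})$ bound quoted above, while the partially and fully stochastic bounds correspond to replacing exact gradients with stochastic estimates at one or both levels.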
