ResearchTrend.AI
Adaptive Training Distributions with Scalable Online Bilevel Optimization

20 November 2023
David Grangier
Pierre Ablin
Awni Y. Hannun
arXiv: 2311.11973

Papers citing "Adaptive Training Distributions with Scalable Online Bilevel Optimization"

8 / 8 papers shown
Data Selection via Optimal Control for Language Models
Yuxian Gu
Li Dong
Hongning Wang
Y. Hao
Qingxiu Dong
Furu Wei
Minlie Huang
AI4CE
50
4
0
09 Oct 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
36
3
0
30 Sep 2024
Careful with that Scalpel: Improving Gradient Surgery with an EMA
Yu-Guan Hsieh
James Thornton
Eugène Ndiaye
Michal Klein
Marco Cuturi
Pierre Ablin
MedIm
31
0
0
05 Feb 2024
A framework for bilevel optimization that enables stochastic and global variance reduction algorithms
Mathieu Dagréou
Pierre Ablin
Samuel Vaiter
Thomas Moreau
131
95
0
31 Jan 2022
Amortized Implicit Differentiation for Stochastic Bilevel Optimization
Michael Arbel
Julien Mairal
103
58
0
29 Nov 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
250
1,986
0
31 Dec 2020
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
69
235
0
31 Dec 2020
Bilevel Programming for Hyperparameter Optimization and Meta-Learning
Luca Franceschi
P. Frasconi
Saverio Salzo
Riccardo Grazzi
Massimiliano Pontil
99
716
0
13 Jun 2018