2406.16745
Cited By
Bandits with Preference Feedback: A Stackelberg Game Perspective
24 June 2024
Barna Pásztor
Parnian Kassraie
Andreas Krause
Papers citing
"Bandits with Preference Feedback: A Stackelberg Game Perspective"
3 / 3 papers shown
Title
Sample Efficient Preference Alignment in LLMs via Active Exploration
Viraj Mehta
Vikramjeet Das
Ojash Neopane
Yijia Dai
Ilija Bogunovic
Stefano Ermon
Jeff Schneider
Willie Neiswanger
OffRL
25
12
0
01 Dec 2023
A framework for bilevel optimization that enables stochastic and global variance reduction algorithms
Mathieu Dagréou
Pierre Ablin
Samuel Vaiter
Thomas Moreau
129
95
0
31 Jan 2022
Preference-Based Learning for Exoskeleton Gait Optimization
Maegan Tucker
Ellen R. Novoseller
Claudia K. Kann
Yanan Sui
Yisong Yue
J. W. Burdick
Aaron D. Ames
66
90
0
26 Sep 2019