Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences

14 February 2022
Aadirupa Saha
Pierre Gaillard

Papers citing "Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences"

5 / 5 papers shown
Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents
Fanzeng Xia
Hao Liu
Yisong Yue
Tongxin Li
61
1
0
03 Jan 2025
Optimal Design for Human Feedback
Subhojyoti Mukherjee
Anusha Lalitha
Kousha Kalantari
Aniket Deshmukh
Ge Liu
Yifei Ma
B. Kveton
36
0
0
22 Apr 2024
DP-Dueling: Learning from Preference Feedback without Compromising User Privacy
Aadirupa Saha
Hilal Asi
36
1
0
22 Mar 2024
One Arrow, Two Kills: An Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits
Pierre Gaillard
Aadirupa Saha
Soham Dan
16
3
0
26 Oct 2022
ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits
Thomas Kleine Buening
Aadirupa Saha
38
6
0
25 Oct 2022