2407.01887
Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents
3 January 2025
Fanzeng Xia
Hao Liu
Yisong Yue
Tongxin Li
Papers citing
"Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents"
10 / 10 papers shown
Title
Multi-Player Approaches for Dueling Bandits
Or Raveh
Junya Honda
Masashi Sugiyama
22
1
0
25 May 2024
Do LLM Agents Have Regret? A Case Study in Online Learning and Games
Chanwoo Park
Xiangyu Liu
Asuman Ozdaglar
Kaiqing Zhang
69
15
0
25 Mar 2024
Can large language models explore in-context?
Akshay Krishnamurthy
Keegan Harris
Dylan J. Foster
Cyril Zhang
Aleksandrs Slivkins
LM&Ro
LLMAG
LRM
110
17
0
22 Mar 2024
Large Language Models to Enhance Bayesian Optimization
Tennison Liu
Nicolás Astorga
Nabeel Seedat
M. Schaar
58
44
0
06 Feb 2024
LLMs-augmented Contextual Bandit
Ali Baheri
Cecilia Ovesdotter Alm
OffRL
26
1
0
03 Nov 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Preference-Based Learning for Exoskeleton Gait Optimization
Maegan Tucker
Ellen R. Novoseller
Claudia K. Kann
Yanan Sui
Yisong Yue
J. W. Burdick
Aaron D. Ames
58
89
0
26 Sep 2019