1209.3352
Cited By
Thompson Sampling for Contextual Bandits with Linear Payoffs
15 September 2012
Shipra Agrawal
Navin Goyal
Papers citing
"Thompson Sampling for Contextual Bandits with Linear Payoffs"
19 / 19 papers shown
Prompt Optimization with Logged Bandit Data
Haruka Kiyohara
Daniel Yiming Cao
Yuta Saito
Thorsten Joachims
123
0
0
03 Apr 2025
Linear Bandits with Partially Observable Features
Wonyoung Hedge Kim
Sungwoo Park
G. Iyengar
A. Zeevi
Min Hwan Oh
113
1
0
10 Feb 2025
Distributed Thompson sampling under constrained communication
Saba Zerefa
Zhaolin Ren
Haitong Ma
Na Li
71
1
0
03 Jan 2025
AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games
Kefan Su
Yusen Huo
Zhilin Zhang
Shuai Dou
Chuan Yu
Jian Xu
Zongqing Lu
Bo Zheng
101
7
0
31 Dec 2024
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem
Nima Akbarzadeh
Erick Delage
Yossiri Adulyasak
91
0
0
30 Oct 2024
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit
Junyu Cao
Ruijiang Gao
Esmaeil Keyvanshokooh
102
1
0
18 Oct 2024
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
142
4
0
24 Sep 2024
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback
Arun Verma
Zhongxiang Dai
Xiaoqiang Lin
Patrick Jaillet
K. H. Low
84
5
0
24 Jul 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
97
2
0
13 Jun 2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off
Itai Shufaro
Nadav Merlis
Nir Weinberger
Shie Mannor
94
0
0
26 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
78
5
0
22 Feb 2024
Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits
Nicolas Nguyen
Imad Aouali
András Gyorgy
Claire Vernade
56
2
0
08 Feb 2024
Ensemble sampling for linear bandits: small ensembles suffice
David Janz
A. Litvak
Csaba Szepesvári
52
1
0
14 Nov 2023
Selective Uncertainty Propagation in Offline RL
Sanath Kumar Krishnamurthy
Shrey Modi
Tanmay Gangwani
S. Katariya
Branislav Kveton
A. Rangi
OffRL
113
0
0
01 Feb 2023
Safe Linear Thompson Sampling with Side Information
Ahmadreza Moradipari
Sanae Amani
M. Alizadeh
Christos Thrampoulidis
76
42
0
06 Nov 2019
Learning to Optimize Via Posterior Sampling
Daniel Russo
Benjamin Van Roy
114
697
0
11 Jan 2013
Further Optimal Regret Bounds for Thompson Sampling
Shipra Agrawal
Navin Goyal
64
443
0
15 Sep 2012
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
E. Kaufmann
N. Korda
Rémi Munos
79
585
0
18 May 2012
Towards minimax policies for online linear optimization with bandit feedback
Sébastien Bubeck
Nicolò Cesa-Bianchi
Sham Kakade
OffRL
109
149
0
14 Feb 2012