Thompson Sampling for Contextual Bandits with Linear Payoffs

15 September 2012

Papers citing "Thompson Sampling for Contextual Bandits with Linear Payoffs"


Title | Authors | Topic | Counts | Date
Prompt Optimization with Logged Bandit Data | Haruka Kiyohara, Daniel Yiming Cao, Yuta Saito, Thorsten Joachims | - | 123 0 0 | 03 Apr 2025
Linear Bandits with Partially Observable Features | Wonyoung Hedge Kim, Sungwoo Park, G. Iyengar, A. Zeevi, Min Hwan Oh | - | 113 1 0 | 10 Feb 2025
Distributed Thompson sampling under constrained communication | Saba Zerefa, Zhaolin Ren, Haitong Ma, Na Li | - | 71 1 0 | 03 Jan 2025
AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games | Kefan Su, Yusen Huo, Zhilin Zhang, Shuai Dou, Chuan Yu, Jian Xu, Zongqing Lu, Bo Zheng | - | 101 7 0 | 31 Dec 2024
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem | Nima Akbarzadeh, Erick Delage, Yossiri Adulyasak | - | 91 0 0 | 30 Oct 2024
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit | Junyu Cao, Ruijiang Gao, Esmaeil Keyvanshokooh | - | 102 1 0 | 18 Oct 2024
Second Order Bounds for Contextual Bandits with Function Approximation | Aldo Pacchiano | - | 142 4 0 | 24 Sep 2024
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback | Arun Verma, Zhongxiang Dai, Xiaoqiang Lin, Patrick Jaillet, K. H. Low | - | 84 5 0 | 24 Jul 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF | Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Zheng Wen | OffRL | 97 2 0 | 13 Jun 2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off | Itai Shufaro, Nadav Merlis, Nir Weinberger, Shie Mannor | - | 94 0 0 | 26 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces | Imad Aouali, Victor-Emmanuel Brunel, David Rohde, Anna Korba | OffRL | 78 5 0 | 22 Feb 2024
Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits | Nicolas Nguyen, Imad Aouali, András Gyorgy, Claire Vernade | - | 56 2 0 | 08 Feb 2024
Ensemble sampling for linear bandits: small ensembles suffice | David Janz, A. Litvak, Csaba Szepesvári | - | 52 1 0 | 14 Nov 2023
Selective Uncertainty Propagation in Offline RL | Sanath Kumar Krishnamurthy, Shrey Modi, Tanmay Gangwani, S. Katariya, Branislav Kveton, A. Rangi | OffRL | 113 0 0 | 01 Feb 2023
Safe Linear Thompson Sampling with Side Information | Ahmadreza Moradipari, Sanae Amani, M. Alizadeh, Christos Thrampoulidis | - | 76 42 0 | 06 Nov 2019
Learning to Optimize Via Posterior Sampling | Daniel Russo, Benjamin Van Roy | - | 114 697 0 | 11 Jan 2013
Further Optimal Regret Bounds for Thompson Sampling | Shipra Agrawal, Navin Goyal | - | 64 443 0 | 15 Sep 2012
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis | E. Kaufmann, N. Korda, Rémi Munos | - | 79 585 0 | 18 May 2012
Towards minimax policies for online linear optimization with bandit feedback | Sébastien Bubeck, Nicolò Cesa-Bianchi, Sham Kakade | OffRL | 109 149 0 | 14 Feb 2012