2004.06321
Sequential Batch Learning in Finite-Action Linear Contextual Bandits
14 April 2020
Yanjun Han
Zhengqing Zhou
Zhengyuan Zhou
Jose H. Blanchet
Peter Glynn
Yinyu Ye
OffRL
Papers citing "Sequential Batch Learning in Finite-Action Linear Contextual Bandits"
23 / 23 papers shown
Title
Batched Stochastic Bandit for Nondegenerate Functions
Yu Liu
Yunlu Shu
Tianyu Wang
110
0
0
09 May 2024
Generalized Linear Bandits with Limited Adaptivity
Ayush Sawarni
Nirjhar Das
Siddharth Barman
Gaurav Sinha
103
3
0
10 Apr 2024
IBCB: Efficient Inverse Batched Contextual Bandit for Behavioral Evolution History
Yi Xu
Weiran Shen
Xiao Zhang
Jun Xu
OffRL
136
0
0
24 Mar 2024
Introduction to Multi-Armed Bandits
Aleksandrs Slivkins
350
999
0
15 Apr 2019
Batched Multi-armed Bandits Problem
Zijun Gao
Yanjun Han
Zhimei Ren
Zhengqing Zhou
111
140
0
03 Apr 2019
Confounding-Robust Policy Improvement
Nathan Kallus
Angela Zhou
CML
OffRL
186
152
0
22 May 2018
Estimation Considerations in Contextual Bandits
Maria Dimakopoulou
Zhengyuan Zhou
Susan Athey
Guido Imbens
126
69
0
19 Nov 2017
Scalable Generalized Linear Bandits: Online Computation and Hashing
Kwang-Sung Jun
Aniruddha Bhargava
Robert D. Nowak
Rebecca Willett
61
124
0
01 Jun 2017
Provably Optimal Algorithms for Generalized Linear Contextual Bandits
Lihong Li
Yu Lu
Dengyong Zhou
89
94
0
28 Feb 2017
Policy Learning with Observational Data
Susan Athey
Stefan Wager
CML
OffRL
269
183
0
09 Feb 2017
Asymptotic Convergence in Online Learning with Unbounded Delays
Scott Garrabrant
N. Soares
Jessica Taylor
21
10
0
18 Apr 2016
BISTRO: An Efficient Relaxation-Based Method for Contextual Bandits
Alexander Rakhlin
Karthik Sridharan
OffRL
216
72
0
06 Feb 2016
Batched bandit problems
Vianney Perchet
Philippe Rigollet
Sylvain Chassang
E. Snowberg
OffRL
119
200
0
02 May 2015
An Information-Theoretic Analysis of Thompson Sampling
Daniel Russo
Benjamin Van Roy
106
423
0
21 Mar 2014
Online Learning under Delayed Feedback
Pooria Joulani
András György
Csaba Szepesvári
69
278
0
04 Jun 2013
Learning to Optimize Via Posterior Sampling
Daniel Russo
Benjamin Van Roy
158
699
0
11 Jan 2013
Further Optimal Regret Bounds for Thompson Sampling
Shipra Agrawal
Navin Goyal
95
443
0
15 Sep 2012
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
154
993
0
15 Sep 2012
Efficient Optimal Learning for Contextual Bandits
Miroslav Dudík
Daniel J. Hsu
Satyen Kale
Nikos Karampatziakis
John Langford
L. Reyzin
Tong Zhang
146
300
0
13 Jun 2011
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
203
694
0
23 Mar 2011
Nonparametric Bandits with Covariates
Philippe Rigollet
A. Zeevi
172
109
0
08 Mar 2010
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
325
2,935
0
28 Feb 2010
Linearly Parameterized Bandits
Paat Rusmevichientong
J. Tsitsiklis
251
558
0
18 Dec 2008