Corralling a Band of Bandit Algorithms
Annual Conference Computational Learning Theory (COLT), 2016
19 December 2016
Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, Robert Schapire
arXiv: 1612.06246

Papers citing "Corralling a Band of Bandit Algorithms"

50 / 121 papers shown
Damped Online Newton Step for Portfolio Selection. Zakaria Mhammedi, Alexander Rakhlin. Annual Conference Computational Learning Theory (COLT), 2022. 15 Feb 2022.
Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits. Haipeng Luo, Mengxiao Zhang, Peng Zhao, Zhi Zhou. Annual Conference Computational Learning Theory (COLT), 2022. 12 Feb 2022.
Adaptive Bandit Convex Optimization with Heterogeneous Curvature. Haipeng Luo, Mengxiao Zhang, Penghui Zhao. Annual Conference Computational Learning Theory (COLT), 2022. 12 Feb 2022.
Model Selection in Batch Policy Optimization. Jonathan Lee, George Tucker, Ofir Nachum, Bo Dai. International Conference on Machine Learning (ICML), 2021. 23 Dec 2021.
Uncoupled Bandit Learning towards Rationalizability: Benchmarks, Barriers, and Algorithms. Jibang Wu, Haifeng Xu, Fan Yao. 10 Nov 2021.
Misspecified Gaussian Process Bandit Optimization. Ilija Bogunovic, Andreas Krause. Neural Information Processing Systems (NeurIPS), 2021. 09 Nov 2021.
Universal and data-adaptive algorithms for model selection in linear contextual bandits. Vidya Muthukumar, A. Krishnamurthy. 08 Nov 2021.
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure. Hsu Kao, Chen-Yu Wei, V. Subramanian. International Conference on Algorithmic Learning Theory (ALT), 2021. 01 Nov 2021.
Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers. Jeffrey Negrea, Blair Bilodeau, Nicolò Campolongo, Francesco Orabona, Daniel M. Roy. Neural Information Processing Systems (NeurIPS), 2021. 27 Oct 2021.
Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays. Jiatai Huang, Yan Dai, Longbo Huang. 26 Oct 2021.
The Pareto Frontier of model selection for general Contextual Bandits. T. V. Marinov, Julian Zimmert. 25 Oct 2021.
Linear Contextual Bandits with Adversarial Corruptions. Heyang Zhao, Dongruo Zhou, Quanquan Gu. 25 Oct 2021.
Model Selection for Generic Reinforcement Learning. Avishek Ghosh, Sayak Ray Chowdhury, Kannan Ramchandran. 13 Jul 2021.
Adapting to Misspecification in Contextual Bandits. Dylan J. Foster, Claudio Gentile, M. Mohri, Julian Zimmert. 12 Jul 2021.
Model Selection for Generic Contextual Bandits. Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran. IEEE Transactions on Information Theory, 2021. 07 Jul 2021.
On component interactions in two-stage recommender systems. Jiri Hron, K. Krauth, Sai Li, Niki Kilbertus. Neural Information Processing Systems (NeurIPS), 2021. 28 Jun 2021.
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL. Weitong Zhang, Jiafan He, Dongruo Zhou, Amy Zhang, Quanquan Gu. Conference on Uncertainty in Artificial Intelligence (UAI), 2021. 22 Jun 2021.
Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective. Sanath Kumar Krishnamurthy, Adrienne Margaret Propp, Susan Athey. International Conference on Artificial Intelligence and Statistics (AISTATS), 2021. 11 Jun 2021.
Thompson Sampling with a Mixture Prior. Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, Craig Boutilier. International Conference on Artificial Intelligence and Statistics (AISTATS), 2021. 10 Jun 2021.
Feature and Parameter Selection in Stochastic Linear Bandits. Ahmadreza Moradipari, Berkay Turan, Yasin Abbasi-Yadkori, M. Alizadeh, Mohammad Ghavamzadeh. International Conference on Machine Learning (ICML), 2021. 09 Jun 2021.
Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms. Qin Ding, Yue Kang, Yi-Wei Liu, Thomas C. M. Lee, Cho-Jui Hsieh, James Sharpnack. Neural Information Processing Systems (NeurIPS), 2021. 05 Jun 2021.
Multi-armed Bandit Algorithms on System-on-Chip: Go Frequentist or Bayesian? S. Santosh, S. Darak. 05 Jun 2021.
Leveraging Good Representations in Linear Contextual Bandits. Matteo Papini, Andrea Tirinzoni, Marcello Restelli, A. Lazaric, Matteo Pirotta. International Conference on Machine Learning (ICML), 2021. 08 Apr 2021.
A Simple Approach for Non-stationary Linear Bandits. Peng Zhao, Lijun Zhang, Yuan Jiang, Zhi Zhou. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020. 09 Mar 2021.
Pareto Optimal Model Selection in Linear Bandits. Yinglun Zhu, Robert D. Nowak. International Conference on Artificial Intelligence and Statistics (AISTATS), 2021. 12 Feb 2021.
Finding the Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case. Liyu Chen, Haipeng Luo. International Conference on Machine Learning (ICML), 2021. 10 Feb 2021.
Nonstochastic Bandits with Infinitely Many Experts. X. Meng, Tuhin Sarkar, M. Dahleh. IEEE Conference on Decision and Control (CDC), 2021. 09 Feb 2021.
Tactical Optimism and Pessimism for Deep Reinforcement Learning. Theodore H. Moskovitz, Jack Parker-Holder, Aldo Pacchiano, Michael Arbel, Sai Li. Neural Information Processing Systems (NeurIPS), 2021. 07 Feb 2021.
Online Markov Decision Processes with Aggregate Bandit Feedback. Alon Cohen, Haim Kaplan, Tomer Koren, Yishay Mansour. Annual Conference Computational Learning Theory (COLT), 2021. 31 Jan 2021.
Upper Confidence Bounds for Combining Stochastic Bandits. Ashok Cutkosky, Abhimanyu Das, Manish Purohit. 24 Dec 2020.
Regret Bound Balancing and Elimination for Model Selection in Bandits and RL. Aldo Pacchiano, Christoph Dann, Claudio Gentile, Peter L. Bartlett. 24 Dec 2020.
Policy Optimization as Online Learning with Mediator Feedback. Alberto Maria Metelli, Matteo Papini, P. D'Oro, Marcello Restelli. AAAI Conference on Artificial Intelligence (AAAI), 2020. 15 Dec 2020.
Smooth Bandit Optimization: Generalization to Hölder Space. Yusha Liu, Yining Wang, Aarti Singh. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020. 11 Dec 2020.
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition. Liyu Chen, Haipeng Luo, Chen-Yu Wei. 07 Dec 2020.
Online Model Selection: a Rested Bandit Formulation. Leonardo Cella, Claudio Gentile, Massimiliano Pontil. 07 Dec 2020.
Online Model Selection for Reinforcement Learning with Function Approximation. Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020. 19 Nov 2020.
A New Bandit Setting Balancing Information from State Evolution and Corrupted Context. Alexander Galozy, Sławomir Nowaczyk, Mattias Ohlsson. Data Mining and Knowledge Discovery (DMKD), 2020. 16 Nov 2020.
Multitask Bandit Learning Through Heterogeneous Feedback Aggregation. Zhi Wang, Chicheng Zhang, Manish Singh, L. Riek, Kamalika Chaudhuri. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020. 29 Oct 2020.
Tractable contextual bandits beyond realizability. Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020. 25 Oct 2020.
Nonstationary Reinforcement Learning with Linear Function Approximation. Huozhi Zhou, Jinglin Chen, Lav Varshney, A. Jagmohan. 08 Oct 2020.
Regret Bounds and Reinforcement Learning Exploration of EXP-based Algorithms. Mengfan Xu, Diego Klabjan. 20 Sep 2020.
Open Problem: Model Selection for Contextual Bandits. Dylan J. Foster, A. Krishnamurthy, Haipeng Luo. 19 Jun 2020.
Corralling Stochastic Bandit Algorithms. R. Arora, T. V. Marinov, M. Mohri. 16 Jun 2020.
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs. Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang. Neural Information Processing Systems (NeurIPS), 2020. 14 Jun 2020.
Efficient Contextual Bandits with Continuous Actions. Maryam Majzoubi, Chicheng Zhang, Rajan Chari, A. Krishnamurthy, John Langford, Aleksandrs Slivkins. Neural Information Processing Systems (NeurIPS), 2020. 10 Jun 2020.
Regret Balancing for Bandit and RL Model Selection. Yasin Abbasi-Yadkori, Aldo Pacchiano, My Phan. 09 Jun 2020.
Rate-adaptive model selection over a collection of black-box contextual bandit algorithms. Aurélien F. Bibaut, Antoine Chambaz, Mark van der Laan. 05 Jun 2020.
Problem-Complexity Adaptive Model Selection for Stochastic Linear Bandits. Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020. 04 Jun 2020.
Model Selection in Contextual Stochastic Bandit Problems. Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup B. Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvári. Neural Information Processing Systems (NeurIPS), 2020. 03 Mar 2020.
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback. Chung-Wei Lee, Haipeng Luo, Mengxiao Zhang. Annual Conference Computational Learning Theory (COLT), 2020. 02 Feb 2020.