arXiv:1612.06246
Corralling a Band of Bandit Algorithms
Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, Robert Schapire
Annual Conference Computational Learning Theory (COLT), 2016
19 December 2016
Papers citing "Corralling a Band of Bandit Algorithms" (50 of 121 papers shown)
Damped Online Newton Step for Portfolio Selection. Zakaria Mhammedi, Alexander Rakhlin. COLT 2022. 15 Feb 2022.
Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits. Haipeng Luo, Mengxiao Zhang, Peng Zhao, Zhi Zhou. COLT 2022. 12 Feb 2022.
Adaptive Bandit Convex Optimization with Heterogeneous Curvature. Haipeng Luo, Mengxiao Zhang, Penghui Zhao. COLT 2022. 12 Feb 2022.
Model Selection in Batch Policy Optimization. Jonathan Lee, George Tucker, Ofir Nachum, Bo Dai. ICML 2021. 23 Dec 2021. [OffRL]
Uncoupled Bandit Learning towards Rationalizability: Benchmarks, Barriers, and Algorithms. Jibang Wu, Haifeng Xu, Fan Yao. 10 Nov 2021.
Misspecified Gaussian Process Bandit Optimization. Ilija Bogunovic, Andreas Krause. NeurIPS 2021. 09 Nov 2021.
Universal and data-adaptive algorithms for model selection in linear contextual bandits. Vidya Muthukumar, A. Krishnamurthy. 08 Nov 2021.
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure. Hsu Kao, Chen-Yu Wei, V. Subramanian. ALT 2021. 01 Nov 2021.
Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers. Jeffrey Negrea, Blair Bilodeau, Nicolò Campolongo, Francesco Orabona, Daniel M. Roy. NeurIPS 2021. 27 Oct 2021.
Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays. Jiatai Huang, Yan Dai, Longbo Huang. 26 Oct 2021. [AI4CE]
The Pareto Frontier of model selection for general Contextual Bandits. T. V. Marinov, Julian Zimmert. 25 Oct 2021.
Linear Contextual Bandits with Adversarial Corruptions. Heyang Zhao, Dongruo Zhou, Quanquan Gu. 25 Oct 2021. [AAML]
Model Selection for Generic Reinforcement Learning. Avishek Ghosh, Sayak Ray Chowdhury, Kannan Ramchandran. 13 Jul 2021.
Adapting to Misspecification in Contextual Bandits. Dylan J. Foster, Claudio Gentile, M. Mohri, Julian Zimmert. 12 Jul 2021.
Model Selection for Generic Contextual Bandits. Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran. IEEE Trans. Inf. Theory 2021. 07 Jul 2021.
On component interactions in two-stage recommender systems. Jiri Hron, K. Krauth, Sai Li, Niki Kilbertus. NeurIPS 2021. 28 Jun 2021. [CML, LRM]
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL. Weitong Zhang, Jiafan He, Dongruo Zhou, Amy Zhang, Quanquan Gu. UAI 2021. 22 Jun 2021. [OffRL]
Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective. Sanath Kumar Krishnamurthy, Adrienne Margaret Propp, Susan Athey. AISTATS 2021. 11 Jun 2021.
Thompson Sampling with a Mixture Prior. Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, Craig Boutilier. AISTATS 2021. 10 Jun 2021.
Feature and Parameter Selection in Stochastic Linear Bandits. Ahmadreza Moradipari, Berkay Turan, Yasin Abbasi-Yadkori, M. Alizadeh, Mohammad Ghavamzadeh. ICML 2021. 09 Jun 2021.
Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms. Qin Ding, Yue Kang, Yi-Wei Liu, Thomas C. M. Lee, Cho-Jui Hsieh, James Sharpnack. NeurIPS 2021. 05 Jun 2021.
Multi-armed Bandit Algorithms on System-on-Chip: Go Frequentist or Bayesian? S. Santosh, S. Darak. 05 Jun 2021.
Leveraging Good Representations in Linear Contextual Bandits. Matteo Papini, Andrea Tirinzoni, Marcello Restelli, A. Lazaric, Matteo Pirotta. ICML 2021. 08 Apr 2021.
A Simple Approach for Non-stationary Linear Bandits. Peng Zhao, Lijun Zhang, Yuan Jiang, Zhi Zhou. AISTATS 2020. 09 Mar 2021.
Pareto Optimal Model Selection in Linear Bandits. Yinglun Zhu, Robert D. Nowak. AISTATS 2021. 12 Feb 2021.
Finding the Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case. Liyu Chen, Haipeng Luo. ICML 2021. 10 Feb 2021.
Nonstochastic Bandits with Infinitely Many Experts. X. Meng, Tuhin Sarkar, M. Dahleh. CDC 2021. 09 Feb 2021. [OffRL]
Tactical Optimism and Pessimism for Deep Reinforcement Learning. Theodore H. Moskovitz, Jack Parker-Holder, Aldo Pacchiano, Michael Arbel, Sai Li. NeurIPS 2021. 07 Feb 2021.
Online Markov Decision Processes with Aggregate Bandit Feedback. Alon Cohen, Haim Kaplan, Tomer Koren, Yishay Mansour. COLT 2021. 31 Jan 2021. [OffRL]
Upper Confidence Bounds for Combining Stochastic Bandits. Ashok Cutkosky, Abhimanyu Das, Manish Purohit. 24 Dec 2020.
Regret Bound Balancing and Elimination for Model Selection in Bandits and RL. Aldo Pacchiano, Christoph Dann, Claudio Gentile, Peter L. Bartlett. 24 Dec 2020.
Policy Optimization as Online Learning with Mediator Feedback. Alberto Maria Metelli, Matteo Papini, P. D'Oro, Marcello Restelli. AAAI 2020. 15 Dec 2020. [OffRL]
Smooth Bandit Optimization: Generalization to Hölder Space. Yusha Liu, Yining Wang, Aarti Singh. AISTATS 2020. 11 Dec 2020.
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition. Liyu Chen, Haipeng Luo, Chen-Yu Wei. 07 Dec 2020.
Online Model Selection: a Rested Bandit Formulation. Leonardo Cella, Claudio Gentile, Massimiliano Pontil. 07 Dec 2020.
Online Model Selection for Reinforcement Learning with Function Approximation. Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill. AISTATS 2020. 19 Nov 2020. [OffRL]
A New Bandit Setting Balancing Information from State Evolution and Corrupted Context. Alexander Galozy, Sławomir Nowaczyk, Mattias Ohlsson. DMKD 2020. 16 Nov 2020. [OffRL]
Multitask Bandit Learning Through Heterogeneous Feedback Aggregation. Zhi Wang, Chicheng Zhang, Manish Singh, L. Riek, Kamalika Chaudhuri. AISTATS 2020. 29 Oct 2020.
Tractable contextual bandits beyond realizability. Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey. AISTATS 2020. 25 Oct 2020.
Nonstationary Reinforcement Learning with Linear Function Approximation. Huozhi Zhou, Jinglin Chen, Lav Varshney, A. Jagmohan. 08 Oct 2020.
Regret Bounds and Reinforcement Learning Exploration of EXP-based Algorithms. Mengfan Xu, Diego Klabjan. 20 Sep 2020. [OffRL]
Open Problem: Model Selection for Contextual Bandits. Dylan J. Foster, A. Krishnamurthy, Haipeng Luo. 19 Jun 2020.
Corralling Stochastic Bandit Algorithms. R. Arora, T. V. Marinov, M. Mohri. 16 Jun 2020.
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs. Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang. NeurIPS 2020. 14 Jun 2020.
Efficient Contextual Bandits with Continuous Actions. Maryam Majzoubi, Chicheng Zhang, Rajan Chari, A. Krishnamurthy, John Langford, Aleksandrs Slivkins. NeurIPS 2020. 10 Jun 2020. [OffRL]
Regret Balancing for Bandit and RL Model Selection. Yasin Abbasi-Yadkori, Aldo Pacchiano, My Phan. 09 Jun 2020.
Rate-adaptive model selection over a collection of black-box contextual bandit algorithms. Aurélien F. Bibaut, Antoine Chambaz, Mark van der Laan. 05 Jun 2020.
Problem-Complexity Adaptive Model Selection for Stochastic Linear Bandits. Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran. AISTATS 2020. 04 Jun 2020.
Model Selection in Contextual Stochastic Bandit Problems. Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup B. Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvári. NeurIPS 2020. 03 Mar 2020.
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback. Chung-Wei Lee, Haipeng Luo, Mengxiao Zhang. COLT 2020. 02 Feb 2020.