Regret Bound Balancing and Elimination for Model Selection in Bandits and RL

24 December 2020

Papers citing "Regret Bound Balancing and Elimination for Model Selection in Bandits and RL"

42 / 42 papers shown

Improved Training Mechanism for Reinforcement Learning via Online Model Selection

Aida Afshar

Aldo Pacchiano

01 Dec 2025

A Model Selection Approach for Corruption Robust Reinforcement Learning

International Conference on Algorithmic Learning Theory (ALT), 2021

Chen-Yu Wei

Christoph Dann

Julian Zimmert

390

31 Dec 2024

Model Selection for Average Reward RL with Application to Utility Maximization in Repeated Games

Alireza Masoumian

James R. Wright

564

09 Nov 2024

Bayesian Optimisation with Unknown Hyperparameters: Regret Bounds Logarithmically Closer to Optimal

Neural Information Processing Systems (NeurIPS), 2024

Juliusz Ziomek

Masaki Adachi

Michael A. Osborne

501

14 Oct 2024

Learning Rate-Free Reinforcement Learning: A Case for Model Selection with Non-Stationary Objectives

Aida Afshar

Aldo Pacchiano

234

07 Aug 2024

Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals

252

01 Jul 2024

Sparsity-Agnostic Linear Bandits with Adaptive Adversaries

Tianyuan Jin

Kyoungseok Jang

Nicolò Cesa-Bianchi

271

03 Jun 2024

Symmetric Linear Bandits with Hidden Symmetry

417

22 May 2024

Experiment Planning with Function Approximation

Neural Information Processing Systems (NeurIPS), 2024

237

10 Jan 2024

Multitask Learning with No Regret: from Improved Confidence Bounds to Active Learning

Neural Information Processing Systems (NeurIPS), 2023

245

03 Aug 2023

Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits

International Conference on Learning Representations (ICLR), 2023

Yuwei Luo

Mohsen Bayati

406

26 Jun 2023

Data-Driven Online Model Selection With Regret Guarantees

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

453

05 Jun 2023

Adaptation to Misspecified Kernel Regularity in Kernelised Bandits

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

Yusha Liu

Aarti Singh

356

26 Apr 2023

Data-Efficient Policy Selection for Navigation in Partial Maps via Subgoal-Based Abstraction

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

Abhishek Paudel

Gregory J. Stein

238

03 Apr 2023

Estimating Optimal Policy Value in General Linear Contextual Bandits

274

19 Feb 2023

Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits

Yue Kang

Cho-Jui Hsieh

T. C. Lee

372

18 Feb 2023

Stochastic Rising Bandits

International Conference on Machine Learning (ICML), 2022

Alberto Maria Metelli

F. Trovò

Matteo Pirola

Marcello Restelli

199

07 Dec 2022

Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity

Neural Information Processing Systems (NeurIPS), 2022

Abhishek Gupta

260

100

18 Oct 2022

Neural Design for Genetic Perturbation Experiments

International Conference on Learning Representations (ICLR), 2022

318

26 Jul 2022

Exploration in Linear Bandits with Rich Action Sets and its Implications for Inference

International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

337

23 Jul 2022

Model Selection in Reinforcement Learning with General Function Approximations

Avishek Ghosh

Sayak Ray Chowdhury

190

06 Jul 2022

Best of Both Worlds Model Selection

Neural Information Processing Systems (NeurIPS), 2022

Aldo Pacchiano

Christoph Dann

Claudio Gentile

245

29 Jun 2022

Joint Representation Training in Sequential Tasks with Shared Structure

294

24 Jun 2022

Provable Benefits of Representational Transfer in Reinforcement Learning

Annual Conference Computational Learning Theory (COLT), 2022

Mengdi Wang

358

29 May 2022

Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits

International Conference on Machine Learning (ICML), 2022

Avishek Ghosh

Abishek Sankararaman

229

19 May 2022

Neural Pseudo-Label Optimism for the Bank Loan Problem

Neural Information Processing Systems (NeurIPS), 2021

174

03 Dec 2021

Misspecified Gaussian Process Bandit Optimization

Neural Information Processing Systems (NeurIPS), 2021

Ilija Bogunovic

Andreas Krause

266

09 Nov 2021

Universal and data-adaptive algorithms for model selection in linear contextual bandits

Vidya Muthukumar

A. Krishnamurthy

307

08 Nov 2021

The Pareto Frontier of model selection for general Contextual Bandits

T. V. Marinov

Julian Zimmert

254

25 Oct 2021

Improved Algorithms for Misspecified Linear Markov Decision Processes

268

12 Sep 2021

Model Selection for Generic Reinforcement Learning

Avishek Ghosh

Sayak Ray Chowdhury

Kannan Ramchandran

267

13 Jul 2021

Model Selection for Generic Contextual Bandits

IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2021

Avishek Ghosh

Abishek Sankararaman

Kannan Ramchandran

307

07 Jul 2021

Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL

Conference on Uncertainty in Artificial Intelligence (UAI), 2021

Quanquan Gu

307

22 Jun 2021

Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective

International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Sanath Kumar Krishnamurthy

Adrienne Margaret Propp

Susan Athey

334

11 Jun 2021

Feature and Parameter Selection in Stochastic Linear Bandits

International Conference on Machine Learning (ICML), 2021

426

09 Jun 2021

Neural Active Learning with Performance Guarantees

Neural Information Processing Systems (NeurIPS), 2021

195

06 Jun 2021

Leveraging Good Representations in Linear Contextual Bandits

International Conference on Machine Learning (ICML), 2021

207

08 Apr 2021

Model-free Representation Learning and Exploration in Low-rank MDPs

Journal of Machine Learning Research (JMLR), 2021

376

14 Feb 2021

Pareto Optimal Model Selection in Linear Bandits

International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Yinglun Zhu

Robert D. Nowak

273

12 Feb 2021

Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach

Annual Conference Computational Learning Theory (COLT), 2021

Chen-Yu Wei

Haipeng Luo

512

129

10 Feb 2021

Tactical Optimism and Pessimism for Deep Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2021

Theodore H. Moskovitz

492

07 Feb 2021

Model Selection in Contextual Stochastic Bandit Problems

Neural Information Processing Systems (NeurIPS), 2020

588

103

03 Mar 2020