Taking a hint: How to leverage loss predictors in contextual bandits?

Annual Conference Computational Learning Theory (COLT), 2020

4 March 2020

Papers citing "Taking a hint: How to leverage loss predictors in contextual bandits?"

38 / 38 papers shown

Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits

Xuheng Li

Quanquan Gu

153

03 Nov 2025

Efficiently Solving Discounted MDPs with Predictions on Transition Matrices

Lixing Lyu

Jiashuo Jiang

Wang Chi Cheung

345

24 Feb 2025

Catoni Contextual Bandits are Robust to Heavy-tailed Rewards

491

04 Feb 2025

How Does Variance Shape the Regret in Contextual Bandits?

Neural Information Processing Systems (NeurIPS), 2024

504

16 Oct 2024

A Parametric Contextual Online Learning Theory of Brokerage

François Bachoc

Tommaso Cesari

Roberto Colomboni

324

22 May 2024

Online Bandits with (Biased) Offline Data: Adaptive Learning under Distribution Mismatch

International Conference on Machine Learning (ICML), 2024

Wang Chi Cheung

Lixing Lyu

OffRL

481

04 May 2024

Online Learning in Contextual Second-Price Pay-Per-Click Auctions

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

Mengxiao Zhang

Haipeng Luo

322

08 Oct 2023

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

374

27 Jul 2023

Online Resource Allocation: Bandits feedback and Advice on Time-varying Demands

Lixing Lyu

Wang Chi Cheung

301

08 Feb 2023

Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions

Neural Information Processing Systems (NeurIPS), 2022

217

05 Nov 2022

Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler

Karthik Sankararaman

Sinong Wang

Han Fang

155

04 Nov 2022

Leveraging Initial Hints for Free in Stochastic Linear Bandits

International Conference on Algorithmic Learning Theory (ALT), 2022

184

08 Mar 2022

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

International Conference on Machine Learning (ICML), 2021

Max Simchowitz

359

07 Dec 2021

Fast Rates for Nonparametric Online Learning: From Realizability to Learning in Games

C. Daskalakis

Noah Golowich

325

17 Nov 2021

Can Q-Learning be Improved with Advice?

Annual Conference Computational Learning Theory (COLT), 2021

Noah Golowich

Ankur Moitra

OffRL

398

25 Oct 2021

Corruption Robust Active Learning

Neural Information Processing Systems (NeurIPS), 2021

Yifang Chen

S. Du

Kevin Jamieson

206

21 Jun 2021

Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

International Conference on Machine Learning (ICML), 2021

281

11 Feb 2021

Multitask Bandit Learning Through Heterogeneous Feedback Aggregation

International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

480

29 Oct 2020

Mean estimation and regression under heavy-tailed distributions--a survey

Foundations of Computational Mathematics (FoCM), 2019

Gabor Lugosi

S. Mendelson

331

284

10 Jun 2019

Equipping Experts/Bandits with Long-term Memory

Neural Information Processing Systems (NeurIPS), 2019

207

30 May 2019

Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting

475

05 Feb 2019

A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free

369

143

03 Feb 2019

Improved Path-length Regret Bounds for Bandits

Annual Conference Computational Learning Theory (COLT), 2019

362

29 Jan 2019

Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits

296

09 Feb 2018

More Adaptive Algorithms for Adversarial Bandits

Chen-Yu Wei

Haipeng Luo

693

201

10 Jan 2018

Tracking the Best Expert in Non-stationary Stochastic Environments

Chen-Yu Wei

Yi-Te Hong

Chi-Jen Lu

281

02 Dec 2017

Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits

Neural Information Processing Systems (NeurIPS), 2016

284

01 Jun 2016

Efficient Algorithms for Adversarial Contextual Learning

Vasilis Syrgkanis

A. Krishnamurthy

Robert Schapire

361

08 Feb 2016

Fast Convergence of Regularized Learning in Games

605

301

02 Jul 2015

The Computational Power of Optimization in Online Learning

Symposium on the Theory of Computing (STOC), 2015

Elad Hazan

Tomer Koren

563

08 Apr 2015

Doubly Robust Policy Evaluation and Optimization

426

311

10 Mar 2015

Strongly Adaptive Online Learning

650

193

25 Feb 2015

Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

International Conference on Machine Learning (ICML), 2014

995

543

04 Feb 2014

Optimization, Learning, and Games with Predictable Sequences

Neural Information Processing Systems (NeurIPS), 2013

Alexander Rakhlin

Karthik Sridharan

538

416

08 Nov 2013

Online Learning with Predictable Sequences

Annual Conference Computational Learning Theory (COLT), 2012

Alexander Rakhlin

Karthik Sridharan

572

410

18 Aug 2012

Efficient Optimal Learning for Contextual Bandits

Conference on Uncertainty in Artificial Intelligence (UAI), 2011

Tong Zhang

427

320

13 Jun 2011

Challenging the empirical mean and empirical variance: a deviation study

O. Catoni

669

505

10 Sep 2010

Contextual Bandit Algorithms with Supervised Learning Guarantees

International Conference on Artificial Intelligence and Statistics (AISTATS), 2010

582

346

22 Feb 2010