All Papers


Title

Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits Xuheng Li Quanquan Gu 60 0 0 03 Nov 2025
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes Jasmine Bayrooti Sattar Vakili Amanda Prorok Carl Henrik Ek 100 0 0 23 Oct 2025
Q-learning with Posterior Sampling Priyank Agrawal Shipra Agrawal Azmat Azati OffRL GP 234 1 0 01 Jun 2025
When a Reinforcement Learning Agent Encounters Unknown Unknowns Juntian Zhu Miguel de Carvalho Zhouwang Yang Fengxiang He 235 0 0 19 May 2025
A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals. International Conference on Learning Representations (ICLR), 2024 Grace Liu Michael Tang Benjamin Eysenbach OffRL 359 8 0 11 Aug 2024
Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error Ally Yalei Du Lin F. Yang Ruosong Wang 186 0 0 18 Jul 2024
Satisficing Exploration for Deep Reinforcement Learning Dilip Arumugam Saurabh Kumar Ramki Gummadi Benjamin Van Roy 212 3 0 16 Jul 2024
More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling Haque Ishfaq Yixin Tan Yu Yang Qingfeng Lan Jianfeng Lu A. Rupam Mahmood Doina Precup Pan Xu 158 8 0 18 Jun 2024
Bayesian Design Principles for Offline-to-Online Reinforcement Learning Haotian Hu Yiqin Yang Jianing Ye Chengjie Wu Ziqing Mai Yujing Hu Tangjie Lv Changjie Fan Qianchuan Zhao Chongjie Zhang OffRL OnRL 195 7 0 31 May 2024
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation Jianliang He Han Zhong Zhuoran Yang 183 6 0 19 Apr 2024
Regret Minimization via Saddle Point Optimization. Neural Information Processing Systems (NeurIPS), 2024 Johannes Kirschner Seyed Alireza Bakhtiari Kushagra Chandak Volodymyr Tkachuk Csaba Szepesvári 144 2 0 15 Mar 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent Yingru Li Jiawei Xu Lei Han Zhi-Quan Luo BDL OffRL 242 7 0 05 Feb 2024
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond Thanh Nguyen-Tang Raman Arora OffRL 226 5 0 06 Jan 2024
Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation Jiayi Huang Han Zhong Liwei Wang Lin F. Yang 147 3 0 07 Dec 2023
Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 Ahmadreza Moradipari M. Pedramfar Modjtaba Shokrian Zini Vaneet Aggarwal 261 6 0 30 Oct 2023
Exploiting Causal Graph Priors with Posterior Sampling for Reinforcement Learning. International Conference on Learning Representations (ICLR), 2023 Mirco Mutti Ric De Santi Marcello Restelli Alexander Marx Giorgia Ramponi CML 240 5 0 11 Oct 2023
Sample-Efficient Multi-Agent RL: An Optimization Perspective. International Conference on Learning Representations (ICLR), 2023 Nuoya Xiong Zhihan Liu Zhaoran Wang Zhuoran Yang 235 1 0 10 Oct 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration. Neural Information Processing Systems (NeurIPS), 2023 Zhihan Liu Miao Lu Wei Xiong Han Zhong Haotian Hu Shenao Zhang Sirui Zheng Zhuoran Yang Zhaoran Wang OffRL 316 24 0 29 May 2023
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo. International Conference on Learning Representations (ICLR), 2023 Haque Ishfaq Qingfeng Lan Pan Xu A. R. Mahmood Doina Precup Anima Anandkumar Kamyar Azizzadenesheli BDL OffRL 264 27 0 29 May 2023
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 Kaiwen Wang Kevin Zhou Runzhe Wu Nathan Kallus Wen Sun OffRL 402 23 0 25 May 2023
Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale Botao Hao Rahul Jain Dengwang Tang Zheng Wen OffRL 158 5 0 20 Mar 2023
Eluder-based Regret for Stochastic Contextual MDPs. International Conference on Machine Learning (ICML), 2022 Orin Levy Asaf B. Cassel Alon Cohen Yishay Mansour 230 7 0 27 Nov 2022
Model-Free Reinforcement Learning with the Decision-Estimation Coefficient. Neural Information Processing Systems (NeurIPS), 2022 Dylan J. Foster Noah Golowich Jian Qian Alexander Rakhlin Ayush Sekhari OffRL 193 12 0 25 Nov 2022
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games. International Conference on Machine Learning (ICML), 2022 Wei Xiong Han Zhong Chengshuai Shi Cong Shen Tong Zhang 160 21 0 04 Oct 2022
Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation. International Conference on Machine Learning (ICML), 2022 Christoph Dann Yishay Mansour M. Mohri Ayush Sekhari Karthik Sridharan 210 67 0 19 Jun 2022
Regret Bounds for Information-Directed Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2022 Botao Hao Tor Lattimore OffRL 234 23 0 09 Jun 2022
Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling. Annual Conference Computational Learning Theory (COLT), 2022 Alekh Agarwal Tong Zhang 151 9 0 15 Mar 2022
Fast Rates in Pool-Based Batch Active Learning Claudio Gentile Zhilei Wang Tong Zhang 275 18 0 11 Feb 2022
Nonstationary Reinforcement Learning with Linear Function Approximation Huozhi Zhou Jinglin Chen Lav Varshney A. Jagmohan 299 31 0 08 Oct 2020

