A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2022
23 August 2022
Christoph Dann
M. Mohri
Tong Zhang
Julian Zimmert
    OffRL

Papers citing "A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning"

29 / 29 papers shown
Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits
Xuheng Li
Quanquan Gu
60
0
0
03 Nov 2025
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
Jasmine Bayrooti
Sattar Vakili
Amanda Prorok
Carl Henrik Ek
76
0
0
23 Oct 2025
Q-learning with Posterior Sampling
Priyank Agrawal
Shipra Agrawal
Azmat Azati
OffRL GP
230
1
0
01 Jun 2025
When a Reinforcement Learning Agent Encounters Unknown Unknowns
Juntian Zhu
Miguel de Carvalho
Zhouwang Yang
Fengxiang He
227
0
0
19 May 2025
A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals
International Conference on Learning Representations (ICLR), 2024
Grace Liu
Michael Tang
Benjamin Eysenbach
OffRL
351
8
0
11 Aug 2024
Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error
Ally Yalei Du
Lin F. Yang
Ruosong Wang
186
0
0
18 Jul 2024
Satisficing Exploration for Deep Reinforcement Learning
Dilip Arumugam
Saurabh Kumar
Ramki Gummadi
Benjamin Van Roy
208
3
0
16 Jul 2024
More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling
Haque Ishfaq
Yixin Tan
Yu Yang
Qingfeng Lan
Jianfeng Lu
A. Rupam Mahmood
Doina Precup
Pan Xu
158
8
0
18 Jun 2024
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Haotian Hu
Yiqin Yang
Jianing Ye
Chengjie Wu
Ziqing Mai
Yujing Hu
Tangjie Lv
Changjie Fan
Qianchuan Zhao
Chongjie Zhang
OffRL OnRL
195
7
0
31 May 2024
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
Jianliang He
Han Zhong
Zhuoran Yang
183
6
0
19 Apr 2024
Regret Minimization via Saddle Point Optimization
Neural Information Processing Systems (NeurIPS), 2024
Johannes Kirschner
Seyed Alireza Bakhtiari
Kushagra Chandak
Volodymyr Tkachuk
Csaba Szepesvári
144
2
0
15 Mar 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li
Jiawei Xu
Lei Han
Zhi-Quan Luo
BDL OffRL
242
7
0
05 Feb 2024
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
Thanh Nguyen-Tang
Raman Arora
OffRL
226
5
0
06 Jan 2024
Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation
Jiayi Huang
Han Zhong
Liwei Wang
Lin F. Yang
147
3
0
07 Dec 2023
Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Ahmadreza Moradipari
M. Pedramfar
Modjtaba Shokrian Zini
Vaneet Aggarwal
261
6
0
30 Oct 2023
Exploiting Causal Graph Priors with Posterior Sampling for Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Mirco Mutti
Ric De Santi
Marcello Restelli
Alexander Marx
Giorgia Ramponi
CML
236
5
0
11 Oct 2023
Sample-Efficient Multi-Agent RL: An Optimization Perspective
International Conference on Learning Representations (ICLR), 2023
Nuoya Xiong
Zhihan Liu
Zhaoran Wang
Zhuoran Yang
231
1
0
10 Oct 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
Neural Information Processing Systems (NeurIPS), 2023
Zhihan Liu
Miao Lu
Wei Xiong
Han Zhong
Haotian Hu
Shenao Zhang
Sirui Zheng
Zhuoran Yang
Zhaoran Wang
OffRL
312
24
0
29 May 2023
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
International Conference on Learning Representations (ICLR), 2023
Haque Ishfaq
Qingfeng Lan
Pan Xu
A. R. Mahmood
Doina Precup
Anima Anandkumar
Kamyar Azizzadenesheli
BDL OffRL
252
27
0
29 May 2023
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Kaiwen Wang
Kevin Zhou
Runzhe Wu
Nathan Kallus
Wen Sun
OffRL
402
23
0
25 May 2023
Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale
Botao Hao
Rahul Jain
Dengwang Tang
Zheng Wen
OffRL
158
5
0
20 Mar 2023
Eluder-based Regret for Stochastic Contextual MDPs
International Conference on Machine Learning (ICML), 2022
Orin Levy
Asaf B. Cassel
Alon Cohen
Yishay Mansour
230
7
0
27 Nov 2022
Model-Free Reinforcement Learning with the Decision-Estimation Coefficient
Neural Information Processing Systems (NeurIPS), 2022
Dylan J. Foster
Noah Golowich
Jian Qian
Alexander Rakhlin
Ayush Sekhari
OffRL
189
12
0
25 Nov 2022
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
International Conference on Machine Learning (ICML), 2022
Wei Xiong
Han Zhong
Chengshuai Shi
Cong Shen
Tong Zhang
156
21
0
04 Oct 2022
Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation
International Conference on Machine Learning (ICML), 2022
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
206
67
0
19 Jun 2022
Regret Bounds for Information-Directed Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Botao Hao
Tor Lattimore
OffRL
234
23
0
09 Jun 2022
Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling
Annual Conference Computational Learning Theory (COLT), 2022
Alekh Agarwal
Tong Zhang
147
9
0
15 Mar 2022
Fast Rates in Pool-Based Batch Active Learning
Claudio Gentile
Zhilei Wang
Tong Zhang
259
18
0
11 Feb 2022
Nonstationary Reinforcement Learning with Linear Function Approximation
Huozhi Zhou
Jinglin Chen
Lav Varshney
A. Jagmohan
299
31
0
08 Oct 2020