(More) Efficient Reinforcement Learning via Posterior Sampling

Neural Information Processing Systems (NeurIPS), 2013
4 June 2013
Ian Osband
Daniel Russo
Benjamin Van Roy

Papers citing "(More) Efficient Reinforcement Learning via Posterior Sampling"

50 / 316 papers shown
Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems
Utkarsh U. Chavan
Prashant Trivedi
N. Hemachandra
151
0
0
06 Nov 2025
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
Jasmine Bayrooti
Sattar Vakili
Amanda Prorok
Carl Henrik Ek
151
2
0
23 Oct 2025
The Confusing Instance Principle for Online Linear Quadratic Control
Waris Radji
Odalric-Ambrym Maillard
OffRL
180
1
0
22 Oct 2025
Exploration via Feature Perturbation in Contextual Bandits
Seouh-won Yi
Min-hwan Oh
AAML
243
0
0
20 Oct 2025
Demystifying the Mechanisms Behind Emergent Exploration in Goal-conditioned RL
Mahsa Bastankhah
Grace Liu
Dilip Arumugam
Thomas L. Griffiths
Benjamin Eysenbach
147
1
0
15 Oct 2025
Bayesian Optimization for Dynamic Pricing and Learning
Anush Anand
Pranav Agrawal
Tejas Bodas
193
0
0
14 Oct 2025
Provable Anytime Ensemble Sampling Algorithms in Nonlinear Contextual Bandits
Jiazheng Sun
Weixin Wang
Pan Xu
200
1
0
12 Oct 2025
UAMDP: Uncertainty-Aware Markov Decision Process for Risk-Constrained Reinforcement Learning from Probabilistic Forecasts
Michal Koren
Or Peretz
Tai Dinh
Philip S. Yu
139
0
0
09 Oct 2025
Stochastic Path Planning in Correlated Obstacle Fields
Li Zhou
Elvan Ceyhan
259
0
0
23 Sep 2025
Safe and Near-Optimal Control with Online Dynamics Learning
Manish Prajapat
Johannes Köhler
Melanie Zeilinger
Andreas Krause
164
0
0
20 Sep 2025
Online Bayesian Risk-Averse Reinforcement Learning
Yuhao Wang
Enlu Zhou
OffRL
286
0
0
17 Sep 2025
Outcome-based Exploration for LLM Reasoning
Yuda Song
Julia Kempe
Remi Munos
OffRL
LRM
321
49
0
08 Sep 2025
Priors Matter: Addressing Misspecification in Bayesian Deep Q-Learning
Pascal R. van der Vaart
Neil Yorke-Smith
M. Spaan
BDL
UQCV
213
0
0
29 Aug 2025
Divide, Discover, Deploy: Factorized Skill Learning with Symmetry and Style Priors
Rafael Cathomen
Mayank Mittal
Marin Vlastelica
Marco Hutter
213
2
0
27 Aug 2025
QueryBandits for Hallucination Mitigation: Exploiting Semantic Features for No-Regret Rewriting
Nicole Cho
William Watson
Alec Koppel
Sumitra Ganesh
Manuela Veloso
AAML
214
0
0
22 Aug 2025
Convergent Reinforcement Learning Algorithms for Stochastic Shortest Path Problem
Soumyajit Guin
S. Bhatnagar
122
0
0
19 Aug 2025
Q-learning with Posterior Sampling
Priyank Agrawal
Shipra Agrawal
Azmat Azati
OffRL
GP
367
2
0
01 Jun 2025
Deep Actor-Critics with Tight Risk Certificates
Bahareh Tasdighi
Manuel Haussmann
Yi-Shan Wu
A. Masegosa
M. Kandemir
UQCV
543
0
0
26 May 2025
Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems
Jiayi Geng
Howard Chen
Dilip Arumugam
Thomas L. Griffiths
448
3
0
23 May 2025
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
473
12
0
29 Apr 2025
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments
Yun Qu
Wenjie Wang
Yixiu Mao
Yiqin Lv
Xiangyang Ji
TTA
604
10
0
27 Apr 2025
Reinforcement Learning from Multi-level and Episodic Human Feedback
Conference on Learning for Dynamics & Control (L4DC), 2025
Muhammad Qasim Elahi
Somtochukwu Oguchienti
Maheed H. Ahmed
Mahsa Ghasemi
OffRL
600
0
0
20 Apr 2025
Smart Exploration in Reinforcement Learning using Bounded Uncertainty Models
J.S. van Hulst
W.P.M.H. Heemels
D.J. Antunes
OffRL
195
1
0
08 Apr 2025
Minimax Optimal Reinforcement Learning with Quasi-Optimism
International Conference on Learning Representations (ICLR), 2025
Harin Lee
Min-hwan Oh
OffRL
420
2
0
02 Mar 2025
Online Planning of Power Flows for Power Systems Against Bushfires Using Spatial Context
Jianyu Xu
Qiuzhuang Sun
Yang Yang
Huadong Mo
Daoyi Dong
412
0
0
24 Feb 2025
EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning
Asian Conference on Machine Learning (ACML), 2025
Siddharth Aravindan
Dixant Mittal
Wee Sun Lee
BDL
326
0
0
17 Jan 2025
Online MDP with Transition Prototypes: A Robust Adaptive Approach
Shuo Sun
Meng Qi
Z. Shen
344
0
0
18 Dec 2024
Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions
Neural Information Processing Systems (NeurIPS), 2024
Rui Yang
Jie Wang
Guoping Wu
Yangqiu Song
AAML
OffRL
462
9
0
01 Nov 2024
Demystifying Linear MDPs and Novel Dynamics Aggregation Framework
International Conference on Learning Representations (ICLR), 2024
Joongkyu Lee
Min-hwan Oh
343
5
0
31 Oct 2024
Risk-Aware Decision Making in Restless Bandits: Theory and Algorithms for Planning and Learning
Nima Akbarzadeh
Erick Delage
Yossiri Adulyasak
432
0
0
30 Oct 2024
Practical Bayesian Algorithm Execution via Posterior Sampling
Neural Information Processing Systems (NeurIPS), 2024
Chu Xin Cheng
Raul Astudillo
Thomas Desautels
Yisong Yue
297
2
0
27 Oct 2024
Random Policy Enables In-Context Reinforcement Learning within Trust Horizons
Weiqin Chen
Santiago Paternain
OffRL
419
0
0
25 Oct 2024
EVOLvE: Evaluating and Optimizing LLMs For In-Context Exploration
Allen Nie
Yi Su
B. Chang
Jonathan N. Lee
Ed H. Chi
Quoc Le
Minmin Chen
305
14
0
08 Oct 2024
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
International Conference on Learning Representations (ICLR), 2024
Jasmine Bayrooti
Carl Henrik Ek
Amanda Prorok
525
4
0
07 Oct 2024
SHIRE: Enhancing Sample Efficiency using Human Intuition in REinforcement Learning
IEEE International Conference on Robotics and Automation (ICRA), 2024
Amogh Joshi
Adarsh Kosta
Kaushik Roy
OffRL
482
4
0
16 Sep 2024
Random Latent Exploration for Deep Reinforcement Learning
Srinath Mahankali
Zhang-Wei Hong
Ayush Sekhari
Alexander Rakhlin
Pulkit Agrawal
722
8
0
18 Jul 2024
Optimistic Q-learning for average reward and episodic reinforcement learning
Priyank Agrawal
Shipra Agrawal
473
9
0
18 Jul 2024
Satisficing Exploration for Deep Reinforcement Learning
Dilip Arumugam
Saurabh Kumar
Ramki Gummadi
Benjamin Van Roy
287
3
0
16 Jul 2024
Model-Free Active Exploration in Reinforcement Learning
Alessio Russo
Alexandre Proutiere
OffRL
394
6
0
30 Jun 2024
Beyond Optimism: Exploration With Partially Observable Rewards
Simone Parisi
Alireza Kazemipour
Michael Bowling
OffRL
295
7
0
20 Jun 2024
More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling
Haque Ishfaq
Yixin Tan
Yu Yang
Qingfeng Lan
Jianfeng Lu
A. Rupam Mahmood
Doina Precup
Pan Xu
216
12
0
18 Jun 2024
Reinforcement Learning and Regret Bounds for Admission Control
International Conference on Machine Learning (ICML), 2024
Lucas Weber
A. Busic
Jiamin Zhu
186
1
0
07 Jun 2024
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Shenao Zhang
Donghan Yu
Hiteshi Sharma
Ziyi Yang
Shuohang Wang
Hany Hassan
Zhaoran Wang
LRM
306
57
0
29 May 2024
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling
Danil Provodin
M. Kaptein
Mykola Pechenizkiy
318
0
0
29 May 2024
Preparing for Black Swans: The Antifragility Imperative for Machine Learning
Ming Jin
358
6
0
18 May 2024
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
Neural Information Processing Systems (NeurIPS), 2024
Vahid Balazadeh Meresht
Keertana Chidambaram
Viet Nguyen
Fahad Razak
Vasilis Syrgkanis
505
2
0
10 Apr 2024
Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning
Srinjoy Roy
Swagatam Das
358
0
0
31 Mar 2024
Prior-dependent analysis of posterior sampling reinforcement learning with function approximation
Yingru Li
Zhi-Quan Luo
232
0
0
17 Mar 2024
Function-space Parameterization of Neural Networks for Sequential Learning
Aidan Scannell
Riccardo Mereu
Paul E. Chang
Ella Tamir
Joni Pajarinen
Arno Solin
BDL
289
6
0
16 Mar 2024
Model-Free Approximate Bayesian Learning for Large-Scale Conversion Funnel Optimization
Production and Operations Management (POM), 2024
Garud Iyengar
Raghav Singal
225
1
0
12 Jan 2024