2003.02189
Exploration-Exploitation in Constrained MDPs
4 March 2020
Yonathan Efroni
Shie Mannor
Matteo Pirotta
Papers citing
"Exploration-Exploitation in Constrained MDPs"
50 / 110 papers shown
Title
Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds
Qian Zuo
Fengxiang He
26
0
0
07 Apr 2025
Primal-Dual Sample Complexity Bounds for Constrained Markov Decision Processes with Multiple Constraints
Max Buckley
Konstantinos Papathanasiou
Andreas Spanopoulos
50
0
0
09 Mar 2025
Provably Efficient RL for Linear MDPs under Instantaneous Safety Constraints in Non-Convex Feature Spaces
Amirhossein Roknilamouki
A. Ghosh
Ming Shi
Fatemeh Nourzad
Eylem Ekici
Ness B. Shroff
64
0
0
25 Feb 2025
Embedding Safety into RL: A New Take on Trust Region Methods
Nikola Milosevic
Johannes Müller
Nico Scherf
25
1
0
05 Nov 2024
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
Yarden As
Bhavya Sukhija
Lenart Treven
Carmelo Sferrazza
Stelian Coros
Andreas Krause
25
1
0
12 Oct 2024
Optimal Strong Regret and Violation in Constrained MDPs via Policy Optimization
Francesco Emanuele Stradi
Matteo Castiglioni
A. Marchesi
Nicola Gatti
18
1
0
03 Oct 2024
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
Toshinori Kitamura
Tadashi Kozuno
Wataru Kumagai
Kenta Hoshino
Y. Hosoe
Kazumi Kasaura
Masashi Hamaya
Paavo Parmas
Yutaka Matsuo
72
0
0
29 Aug 2024
Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs
Washim Uddin Mondal
Vaneet Aggarwal
41
1
0
21 Aug 2024
A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints
Francesco Emanuele Stradi
Filippo Cipriani
Lorenzo Ciampiconi
Marco Leonardi
A. Rozza
Nicola Gatti
30
0
0
08 Jul 2024
Distributionally Robust Constrained Reinforcement Learning under Strong Duality
Zhengfei Zhang
Kishan Panaganti
Laixi Shi
Yanan Sui
Adam Wierman
Yisong Yue
OOD
39
3
0
22 Jun 2024
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
Subhojyoti Mukherjee
Josiah P. Hanna
Qiaomin Xie
Robert Nowak
72
2
0
07 Jun 2024
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP
Subhojyoti Mukherjee
Josiah P. Hanna
Robert Nowak
OffRL
43
0
0
04 Jun 2024
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling
Danil Provodin
M. Kaptein
Mykola Pechenizkiy
39
0
0
29 May 2024
A CMDP-within-online framework for Meta-Safe Reinforcement Learning
Vanshaj Khattar
Yuhao Ding
Bilgehan Sel
Javad Lavaei
Ming Jin
OffRL
32
12
0
26 May 2024
Constrained Reinforcement Learning Under Model Mismatch
Zhongchang Sun
Sihong He
Fei Miao
Shaofeng Zou
44
4
0
02 May 2024
Learning Control Barrier Functions and their application in Reinforcement Learning: A Survey
Maeva Guerrier
Hassan Fouad
Giovanni Beltrame
OffRL
37
1
0
22 Apr 2024
Structured Reinforcement Learning for Media Streaming at the Wireless Edge
Archana Bura
Sarat Chandra Bobbili
Shreyas Rameshkumar
Desik Rengarajan
D. Kalathil
S. Shakkottai
26
0
0
10 Apr 2024
Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time
Abhijit Mazumdar
Rafał Wisniewski
Manuela L. Bujorianu
18
3
0
23 Mar 2024
Learning Adversarial MDPs with Stochastic Hard Constraints
Francesco Emanuele Stradi
Matteo Castiglioni
A. Marchesi
Nicola Gatti
26
4
0
06 Mar 2024
Truly No-Regret Learning in Constrained MDPs
Adrian Müller
Pragnya Alatur
V. Cevher
Giorgia Ramponi
Niao He
32
7
0
24 Feb 2024
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
Zihao Li
Boyi Liu
Zhuoran Yang
Zhaoran Wang
Mengdi Wang
42
1
0
16 Feb 2024
Markov Persuasion Processes: Learning to Persuade from Scratch
Francesco Bacchiocchi
Francesco Emanuele Stradi
Matteo Castiglioni
A. Marchesi
Nicola Gatti
28
7
0
05 Feb 2024
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees
Toshinori Kitamura
Tadashi Kozuno
Masahiro Kato
Yuki Ichihara
Soichiro Nishimori
Akiyoshi Sannai
Sho Sonoda
Wataru Kumagai
Yutaka Matsuo
42
2
0
31 Jan 2024
Resilient Constrained Reinforcement Learning
Dongsheng Ding
Zhengyan Huan
Alejandro Ribeiro
16
1
0
28 Dec 2023
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
23
0
0
24 Dec 2023
Safe Reinforcement Learning with Instantaneous Constraints: The Role of Aggressive Exploration
Honghao Wei
Xin Liu
Lei Ying
30
1
0
22 Dec 2023
Online Restless Multi-Armed Bandits with Long-Term Fairness Constraints
Shu-Fan Wang
Guojun Xiong
Jian Li
51
6
0
16 Dec 2023
Anytime-Constrained Reinforcement Learning
Jeremy McMahan
Xiaojin Zhu
28
5
0
09 Nov 2023
Anytime-Competitive Reinforcement Learning with Policy Prior
Jianyi Yang
Pengfei Li
Tongxin Li
Adam Wierman
Shaolei Ren
40
2
0
02 Nov 2023
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz
Aaditya K. Singh
DJ Strouse
T. Sandholm
Ruslan Salakhutdinov
Anca D. Dragan
Stephen Marcus McAleer
34
47
0
06 Oct 2023
Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need
Danil Provodin
Pratik Gajane
Mykola Pechenizkiy
M. Kaptein
31
0
0
27 Sep 2023
Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs
Zihan Zhou
Honghao Wei
Lei Ying
OffRL
37
1
0
27 Sep 2023
Active Coverage for PAC Reinforcement Learning
Aymen Al Marjani
Andrea Tirinzoni
E. Kaufmann
OffRL
21
4
0
23 Jun 2023
Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
Dongsheng Ding
Chen-Yu Wei
Kaipeng Zhang
Alejandro Ribeiro
38
19
0
20 Jun 2023
A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning
Kihyuk Hong
Yuhang Li
Ambuj Tewari
OffRL
18
7
0
13 Jun 2023
Provably Learning Nash Policies in Constrained Markov Potential Games
Pragnya Alatur
Giorgia Ramponi
Niao He
Andreas Krause
24
10
0
13 Jun 2023
Cancellation-Free Regret Bounds for Lagrangian Approaches in Constrained Markov Decision Processes
A. Müller
Pragnya Alatur
Giorgia Ramponi
Niao He
23
5
0
12 Jun 2023
Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints
Donghao Li
Ruiquan Huang
Cong Shen
Jing Yang
24
3
0
09 Jun 2023
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
Anas Barakat
Ilyas Fatkhullin
Niao He
26
11
0
02 Jun 2023
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning
Peizhong Ju
A. Ghosh
Ness B. Shroff
30
4
0
01 Jun 2023
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
Mihailo R. Jovanović
OffRL
32
11
0
31 May 2023
Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities
Donghao Ying
Yunkai Zhang
Yuhao Ding
Alec Koppel
Javad Lavaei
33
11
0
27 May 2023
Online Resource Allocation in Episodic Markov Decision Processes
Duksang Lee
William Overman
Dabeen Lee
37
1
0
18 May 2023
Semi-Infinitely Constrained Markov Decision Processes and Efficient Reinforcement Learning
Liangyu Zhang
Yang Peng
Wenhao Yang
Zhihua Zhang
15
1
0
29 Apr 2023
Long-Term Fairness with Unknown Dynamics
Tongxin Yin
Reilly P. Raab
M. Liu
Yang Liu
FaML
15
24
0
19 Apr 2023
Provably Efficient Model-Free Algorithms for Non-stationary CMDPs
Honghao Wei
A. Ghosh
Ness B. Shroff
Lei Ying
Xingyu Zhou
11
13
0
10 Mar 2023
On Bellman's principle of optimality and Reinforcement learning for safety-constrained Markov decision process
Rahul Misra
Rafal Wisniewski
C. Kallesøe
46
0
0
25 Feb 2023
Provably Safe Reinforcement Learning with Step-wise Violation Constraints
Nuoya Xiong
Yihan Du
Longbo Huang
15
9
0
13 Feb 2023
A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints
Ming Shi
Yitao Liang
Ness B. Shroff
33
8
0
08 Feb 2023
Adaptive Aggregation for Safety-Critical Control
Huiliang Zhang
Di Wu
Benoit Boulet
16
0
0
07 Feb 2023