Exploration-Exploitation in Constrained MDPs

4 March 2020

Papers citing "Exploration-Exploitation in Constrained MDPs"

50 / 110 papers shown

Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds Qian Zuo Fengxiang He 26 0 0 07 Apr 2025
Primal-Dual Sample Complexity Bounds for Constrained Markov Decision Processes with Multiple Constraints Max Buckley Konstantinos Papathanasiou Andreas Spanopoulos 50 0 0 09 Mar 2025
Provably Efficient RL for Linear MDPs under Instantaneous Safety Constraints in Non-Convex Feature Spaces Amirhossein Roknilamouki A. Ghosh Ming Shi Fatemeh Nourzad Eylem Ekici Ness B. Shroff 64 0 0 25 Feb 2025
Embedding Safety into RL: A New Take on Trust Region Methods Nikola Milosevic Johannes Müller Nico Scherf 25 1 0 05 Nov 2024
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning Yarden As Bhavya Sukhija Lenart Treven Carmelo Sferrazza Stelian Coros Andreas Krause 25 1 0 12 Oct 2024
Optimal Strong Regret and Violation in Constrained MDPs via Policy Optimization Francesco Emanuele Stradi Matteo Castiglioni A. Marchesi Nicola Gatti 18 1 0 03 Oct 2024
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form Toshinori Kitamura Tadashi Kozuno Wataru Kumagai Kenta Hoshino Y. Hosoe Kazumi Kasaura Masashi Hamaya Paavo Parmas Yutaka Matsuo 72 0 0 29 Aug 2024
Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs Washim Uddin Mondal Vaneet Aggarwal 41 1 0 21 Aug 2024
A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints Francesco Emanuele Stradi Filippo Cipriani Lorenzo Ciampiconi Marco Leonardi A. Rozza Nicola Gatti 30 0 0 08 Jul 2024
Distributionally Robust Constrained Reinforcement Learning under Strong Duality Zhengfei Zhang Kishan Panaganti Laixi Shi Yanan Sui Adam Wierman Yisong Yue OOD 39 3 0 22 Jun 2024
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning Subhojyoti Mukherjee Josiah P. Hanna Qiaomin Xie Robert Nowak 72 2 0 07 Jun 2024
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP Subhojyoti Mukherjee Josiah P. Hanna Robert Nowak OffRL 43 0 0 04 Jun 2024
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling Danil Provodin M. Kaptein Mykola Pechenizkiy 39 0 0 29 May 2024
A CMDP-within-online framework for Meta-Safe Reinforcement Learning Vanshaj Khattar Yuhao Ding Bilgehan Sel Javad Lavaei Ming Jin OffRL 32 12 0 26 May 2024
Constrained Reinforcement Learning Under Model Mismatch Zhongchang Sun Sihong He Fei Miao Shaofeng Zou 44 4 0 02 May 2024
Learning Control Barrier Functions and their application in Reinforcement Learning: A Survey Maeva Guerrier Hassan Fouad Giovanni Beltrame OffRL 37 1 0 22 Apr 2024
Structured Reinforcement Learning for Media Streaming at the Wireless Edge Archana Bura Sarat Chandra Bobbili Shreyas Rameshkumar Desik Rengarajan D. Kalathil S. Shakkottai 26 0 0 10 Apr 2024
Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time Abhijit Mazumdar Rafał Wisniewski Manuela L. Bujorianu 18 3 0 23 Mar 2024
Learning Adversarial MDPs with Stochastic Hard Constraints Francesco Emanuele Stradi Matteo Castiglioni A. Marchesi Nicola Gatti 26 4 0 06 Mar 2024
Truly No-Regret Learning in Constrained MDPs Adrian Müller Pragnya Alatur V. Cevher Giorgia Ramponi Niao He 32 7 0 24 Feb 2024
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning Zihao Li Boyi Liu Zhuoran Yang Zhaoran Wang Mengdi Wang 42 1 0 16 Feb 2024
Markov Persuasion Processes: Learning to Persuade from Scratch Francesco Bacchiocchi Francesco Emanuele Stradi Matteo Castiglioni A. Marchesi Nicola Gatti 28 7 0 05 Feb 2024
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees Toshinori Kitamura Tadashi Kozuno Masahiro Kato Yuki Ichihara Soichiro Nishimori Akiyoshi Sannai Sho Sonoda Wataru Kumagai Yutaka Matsuo 42 2 0 31 Jan 2024
Resilient Constrained Reinforcement Learning Dongsheng Ding Zhengyan Huan Alejandro Ribeiro 16 1 0 28 Dec 2023
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation Paul Daoudi Mathias Formoso Othman Gaizi Achraf Azize Evrard Garcelon OffRL 23 0 0 24 Dec 2023
Safe Reinforcement Learning with Instantaneous Constraints: The Role of Aggressive Exploration Honghao Wei Xin Liu Lei Ying 30 1 0 22 Dec 2023
Online Restless Multi-Armed Bandits with Long-Term Fairness Constraints Shu-Fan Wang Guojun Xiong Jian Li 51 6 0 16 Dec 2023
Anytime-Constrained Reinforcement Learning Jeremy McMahan Xiaojin Zhu 28 5 0 09 Nov 2023
Anytime-Competitive Reinforcement Learning with Policy Prior Jianyi Yang Pengfei Li Tongxin Li Adam Wierman Shaolei Ren 40 2 0 02 Nov 2023
Confronting Reward Model Overoptimization with Constrained RLHF Ted Moskovitz Aaditya K. Singh DJ Strouse T. Sandholm Ruslan Salakhutdinov Anca D. Dragan Stephen Marcus McAleer 34 47 0 06 Oct 2023
Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need Danil Provodin Pratik Gajane Mykola Pechenizkiy M. Kaptein 31 0 0 27 Sep 2023
Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs Zihan Zhou Honghao Wei Lei Ying OffRL 40 1 0 27 Sep 2023
Active Coverage for PAC Reinforcement Learning Aymen Al Marjani Andrea Tirinzoni E. Kaufmann OffRL 21 4 0 23 Jun 2023
Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs Dongsheng Ding Chen-Yu Wei Kaipeng Zhang Alejandro Ribeiro 40 19 0 20 Jun 2023
A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning Kihyuk Hong Yuhang Li Ambuj Tewari OffRL 18 7 0 13 Jun 2023
Provably Learning Nash Policies in Constrained Markov Potential Games Pragnya Alatur Giorgia Ramponi Niao He Andreas Krause 24 10 0 13 Jun 2023
Cancellation-Free Regret Bounds for Lagrangian Approaches in Constrained Markov Decision Processes A. Müller Pragnya Alatur Giorgia Ramponi Niao He 23 5 0 12 Jun 2023
Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints Donghao Li Ruiquan Huang Cong Shen Jing Yang 24 3 0 09 Jun 2023
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space Anas Barakat Ilyas Fatkhullin Niao He 26 11 0 02 Jun 2023
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning Peizhong Ju A. Ghosh Ness B. Shroff 30 4 0 01 Jun 2023
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning Dongsheng Ding Xiaohan Wei Zhuoran Yang Zhaoran Wang Mihailo R. Jovanović OffRL 32 11 0 31 May 2023
Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities Donghao Ying Yunkai Zhang Yuhao Ding Alec Koppel Javad Lavaei 33 11 0 27 May 2023
Online Resource Allocation in Episodic Markov Decision Processes Duksang Lee William Overman Dabeen Lee 37 1 0 18 May 2023
Semi-Infinitely Constrained Markov Decision Processes and Efficient Reinforcement Learning Liangyu Zhang Yang Peng Wenhao Yang Zhihua Zhang 15 1 0 29 Apr 2023
Long-Term Fairness with Unknown Dynamics Tongxin Yin Reilly P. Raab M. Liu Yang Liu FaML 15 24 0 19 Apr 2023
Provably Efficient Model-Free Algorithms for Non-stationary CMDPs Honghao Wei A. Ghosh Ness B. Shroff Lei Ying Xingyu Zhou 11 13 0 10 Mar 2023
On Bellman's principle of optimality and Reinforcement learning for safety-constrained Markov decision process Rahul Misra Rafal Wisniewski C. Kallesøe 46 0 0 25 Feb 2023
Provably Safe Reinforcement Learning with Step-wise Violation Constraints Nuoya Xiong Yihan Du Longbo Huang 15 9 0 13 Feb 2023
A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints Ming Shi Yitao Liang Ness B. Shroff 33 8 0 08 Feb 2023
Adaptive Aggregation for Safety-Critical Control Huiliang Zhang Di Wu Benoit Boulet 16 0 0 07 Feb 2023