1705.07798
A unified view of entropy-regularized Markov decision processes
22 May 2017
Gergely Neu
Anders Jonsson
Vicenç Gómez
Papers citing
"A unified view of entropy-regularized Markov decision processes"
50 / 76 papers shown
Title
A Two-Timescale Primal-Dual Framework for Reinforcement Learning via Online Dual Variable Guidance
Axel Friedrich Wolter
Tobias Sutter
OffRL
37
0
0
07 May 2025
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
Yongtao Wu
Luca Viano
Yihang Chen
Zhenyu Zhu
Kimon Antonakopoulos
Quanquan Gu
V. Cevher
56
0
0
18 Feb 2025
Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning
Rémy Hosseinkhan Boucher
Onofrio Semeraro
L. Mathelin
82
0
0
28 Jan 2025
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
47
16
0
28 Jan 2025
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
57
3
0
07 Nov 2024
Embedding Safety into RL: A New Take on Trust Region Methods
Nikola Milosevic
Johannes Müller
Nico Scherf
25
1
0
05 Nov 2024
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
Timofei Gritsaev
Nikita Morozov
S. Samsonov
D. Tiapkin
21
0
0
20 Oct 2024
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
Bram De Cooman
Johan A. K. Suykens
38
0
0
25 Apr 2024
Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization
Daniel Jarne Ornia
Giannis Delimpaltadakis
Jens Kober
Javier Alonso-Mora
28
2
0
30 Nov 2023
A Large Deviations Perspective on Policy Gradient Algorithms
Wouter Jongeneel
Daniel Kuhn
Mengmeng Li
31
1
0
13 Nov 2023
Multi-Player Zero-Sum Markov Games with Networked Separable Interactions
Chanwoo Park
Kaipeng Zhang
Asuman Ozdaglar
30
8
0
13 Jul 2023
LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning
Outongyi Lv
Bingxin Zhou
OffRL
44
0
0
05 Jul 2023
Identifiability and Generalizability in Constrained Inverse Reinforcement Learning
Andreas Schlaginhaufen
Maryam Kamgarpour
29
10
0
01 Jun 2023
Offline Primal-Dual Reinforcement Learning for Linear MDPs
Germano Gabbianelli
Gergely Neu
Nneka Okolo
Matteo Papini
OffRL
29
7
0
22 May 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
36
73
0
28 Mar 2023
Fast Rates for Maximum Entropy Exploration
D. Tiapkin
Denis Belomestny
Daniele Calandriello
Eric Moulines
Rémi Munos
A. Naumov
Pierre Perrault
Yunhao Tang
Michal Valko
Pierre Menard
44
18
0
14 Mar 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
Masatoshi Uehara
Nathan Kallus
Jason D. Lee
Wen Sun
OffRL
50
5
0
05 Feb 2023
A general Markov decision process formalism for action-state entropy-regularized reward maximization
D. Grytskyy
Jorge Ramírez-Ruiz
R. Moreno-Bote
22
3
0
02 Feb 2023
Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods
Gen Li
Yanxi Chen
Yu Huang
Yuejie Chi
H. Vincent Poor
Yuxin Chen
OT
46
5
0
30 Jan 2023
Efficient Global Planning in Large MDPs via Stochastic Primal-Dual Optimization
Gergely Neu
Nneka Okolo
34
6
0
21 Oct 2022
RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk
J. Hau
Marek Petrik
Mohammad Ghavamzadeh
R. Russel
32
5
0
09 Sep 2022
Variational Inference for Model-Free and Model-Based Reinforcement Learning
Felix Leibfried
OffRL
23
0
0
04 Sep 2022
Entropy Augmented Reinforcement Learning
Jianfei Ma
36
0
0
19 Aug 2022
Bayesian regularization of empirical MDPs
Samarth Gupta
Daniel N. Hill
Lexing Ying
Inderjit Dhillon
OffRL
29
0
0
03 Aug 2022
Performative Reinforcement Learning
Debmalya Mandal
Stelios Triantafyllou
Goran Radanović
33
18
0
30 Jun 2022
Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games
Sihan Zeng
Thinh T. Doan
Justin Romberg
71
22
0
27 May 2022
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
Alexis Jacq
Johan Ferret
Olivier Pietquin
M. Geist
32
9
0
16 Mar 2022
Accelerating Primal-dual Methods for Regularized Markov Decision Processes
Haoya Li
Hsiang-Fu Yu
Lexing Ying
Inderjit Dhillon
34
4
0
21 Feb 2022
Adversarially Trained Actor Critic for Offline Reinforcement Learning
Ching-An Cheng
Tengyang Xie
Nan Jiang
Alekh Agarwal
OffRL
16
127
0
05 Feb 2022
You May Not Need Ratio Clipping in PPO
Mingfei Sun
Vitaly Kurin
Guoqing Liu
Sam Devlin
Tao Qin
Katja Hofmann
Shimon Whiteson
13
15
0
31 Jan 2022
Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime
B. Kerimkulov
J. Leahy
David Siska
Lukasz Szpruch
30
11
0
18 Jan 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
26
24
0
07 Jan 2022
Approximate Newton policy gradient algorithms
Haoya Li
Samarth Gupta
Hsiangfu Yu
Lexing Ying
Inderjit Dhillon
51
2
0
05 Oct 2021
Batch size-invariance for policy optimization
Jacob Hilton
K. Cobbe
John Schulman
17
11
0
01 Oct 2021
Divergence-Regularized Multi-Agent Actor-Critic
Kefan Su
Zongqing Lu
46
25
0
01 Oct 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
28
6
0
13 Sep 2021
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
32
80
0
01 Sep 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
25
29
0
17 Jul 2021
The Confluence of Networks, Games and Learning
Tao Li
Guanze Peng
Quanyan Zhu
Tamer Basar
AI4CE
8
46
0
17 May 2021
Reinforcement learning of rare diffusive dynamics
Avishek Das
Dominic C. Rose
J. P. Garrahan
David T. Limmer
24
27
0
10 May 2021
Safe Chance Constrained Reinforcement Learning for Batch Process Control
M. Mowbray
Panagiotis Petsagkourakis
Ehecatl Antonio del Rio Chanona
Dongda Zhang
OffRL
34
34
0
23 Apr 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
32
52
0
24 Mar 2021
Near Optimal Policy Optimization via REPS
Aldo Pacchiano
Jonathan Lee
Peter L. Bartlett
Ofir Nachum
23
3
0
17 Mar 2021
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
Benjamin Eysenbach
Sergey Levine
OOD
50
175
0
10 Mar 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
95
24
0
17 Feb 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári
19
10
0
11 Feb 2021
Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning
Kai Cui
Heinz Koeppl
64
91
0
02 Feb 2021
A Tutorial on Sparse Gaussian Processes and Variational Inference
Felix Leibfried
Vincent Dutordoir
S. T. John
N. Durrande
GP
42
49
0
27 Dec 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
42
101
0
22 Oct 2020
Logistic Q-Learning
Joan Bas-Serrano
Sebastian Curi
Andreas Krause
Gergely Neu
14
40
0
21 Oct 2020