A unified view of entropy-regularized Markov decision processes

22 May 2017
Gergely Neu
Anders Jonsson
Vicenç Gómez

Papers citing "A unified view of entropy-regularized Markov decision processes"

50 / 77 papers shown
A Two-Timescale Primal-Dual Framework for Reinforcement Learning via Online Dual Variable Guidance
Axel Friedrich Wolter
Tobias Sutter
OffRL
37
0
0
07 May 2025
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
Yongtao Wu
Luca Viano
Yihang Chen
Zhenyu Zhu
Kimon Antonakopoulos
Quanquan Gu
V. Cevher
58
0
0
18 Feb 2025
Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning
Rémy Hosseinkhan Boucher
Onofrio Semeraro
L. Mathelin
82
0
0
28 Jan 2025
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
47
16
0
28 Jan 2025
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
57
3
0
07 Nov 2024
Embedding Safety into RL: A New Take on Trust Region Methods
Nikola Milosevic
Johannes Müller
Nico Scherf
25
1
0
05 Nov 2024
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
Timofei Gritsaev
Nikita Morozov
S. Samsonov
D. Tiapkin
21
0
0
20 Oct 2024
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
Bram De Cooman
Johan A. K. Suykens
38
0
0
25 Apr 2024
Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization
Daniel Jarne Ornia
Giannis Delimpaltadakis
Jens Kober
Javier Alonso-Mora
28
2
0
30 Nov 2023
A Large Deviations Perspective on Policy Gradient Algorithms
Wouter Jongeneel
Daniel Kuhn
Mengmeng Li
31
1
0
13 Nov 2023
Multi-Player Zero-Sum Markov Games with Networked Separable Interactions
Chanwoo Park
Kaipeng Zhang
Asuman Ozdaglar
32
8
0
13 Jul 2023
LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning
Outongyi Lv
Bingxin Zhou
OffRL
44
0
0
05 Jul 2023
Identifiability and Generalizability in Constrained Inverse Reinforcement Learning
Andreas Schlaginhaufen
Maryam Kamgarpour
29
10
0
01 Jun 2023
Offline Primal-Dual Reinforcement Learning for Linear MDPs
Germano Gabbianelli
Gergely Neu
Nneka Okolo
Matteo Papini
OffRL
29
7
0
22 May 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
36
73
0
28 Mar 2023
Fast Rates for Maximum Entropy Exploration
D. Tiapkin
Denis Belomestny
Daniele Calandriello
Eric Moulines
Rémi Munos
A. Naumov
Pierre Perrault
Yunhao Tang
Michal Valko
Pierre Menard
44
18
0
14 Mar 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
Masatoshi Uehara
Nathan Kallus
Jason D. Lee
Wen Sun
OffRL
50
5
0
05 Feb 2023
A general Markov decision process formalism for action-state entropy-regularized reward maximization
D. Grytskyy
Jorge Ramírez-Ruiz
R. Moreno-Bote
22
3
0
02 Feb 2023
Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods
Gen Li
Yanxi Chen
Yu Huang
Yuejie Chi
H. Vincent Poor
Yuxin Chen
OT
46
5
0
30 Jan 2023
Efficient Global Planning in Large MDPs via Stochastic Primal-Dual Optimization
Gergely Neu
Nneka Okolo
37
6
0
21 Oct 2022
RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk
J. Hau
Marek Petrik
Mohammad Ghavamzadeh
R. Russel
32
5
0
09 Sep 2022
Variational Inference for Model-Free and Model-Based Reinforcement Learning
Felix Leibfried
OffRL
23
0
0
04 Sep 2022
Entropy Augmented Reinforcement Learning
Jianfei Ma
36
0
0
19 Aug 2022
Bayesian regularization of empirical MDPs
Samarth Gupta
Daniel N. Hill
Lexing Ying
Inderjit Dhillon
OffRL
29
0
0
03 Aug 2022
Performative Reinforcement Learning
Debmalya Mandal
Stelios Triantafyllou
Goran Radanović
36
18
0
30 Jun 2022
Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games
Sihan Zeng
Thinh T. Doan
Justin Romberg
73
22
0
27 May 2022
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
Alexis Jacq
Johan Ferret
Olivier Pietquin
M. Geist
32
9
0
16 Mar 2022
Accelerating Primal-dual Methods for Regularized Markov Decision Processes
Haoya Li
Hsiang-Fu Yu
Lexing Ying
Inderjit Dhillon
34
4
0
21 Feb 2022
Adversarially Trained Actor Critic for Offline Reinforcement Learning
Ching-An Cheng
Tengyang Xie
Nan Jiang
Alekh Agarwal
OffRL
16
127
0
05 Feb 2022
You May Not Need Ratio Clipping in PPO
Mingfei Sun
Vitaly Kurin
Guoqing Liu
Sam Devlin
Tao Qin
Katja Hofmann
Shimon Whiteson
13
15
0
31 Jan 2022
Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime
B. Kerimkulov
J. Leahy
David Siska
Lukasz Szpruch
33
11
0
18 Jan 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
29
24
0
07 Jan 2022
Approximate Newton policy gradient algorithms
Haoya Li
Samarth Gupta
Hsiangfu Yu
Lexing Ying
Inderjit Dhillon
51
2
0
05 Oct 2021
Batch size-invariance for policy optimization
Jacob Hilton
K. Cobbe
John Schulman
17
11
0
01 Oct 2021
Divergence-Regularized Multi-Agent Actor-Critic
Kefan Su
Zongqing Lu
46
25
0
01 Oct 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
31
6
0
13 Sep 2021
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
34
80
0
01 Sep 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
25
29
0
17 Jul 2021
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
Wenhao Zhan
Shicong Cen
Baihe Huang
Yuxin Chen
Jason D. Lee
Yuejie Chi
21
76
0
24 May 2021
The Confluence of Networks, Games and Learning
Tao Li
Guanze Peng
Quanyan Zhu
Tamer Basar
AI4CE
13
46
0
17 May 2021
Reinforcement learning of rare diffusive dynamics
Avishek Das
Dominic C. Rose
J. P. Garrahan
David T. Limmer
24
27
0
10 May 2021
Safe Chance Constrained Reinforcement Learning for Batch Process Control
M. Mowbray
Panagiotis Petsagkourakis
Ehecatl Antonio del Rio Chanona
Dongda Zhang
OffRL
37
34
0
23 Apr 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
32
52
0
24 Mar 2021
Near Optimal Policy Optimization via REPS
Aldo Pacchiano
Jonathan Lee
Peter L. Bartlett
Ofir Nachum
23
3
0
17 Mar 2021
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
Benjamin Eysenbach
Sergey Levine
OOD
50
175
0
10 Mar 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
95
24
0
17 Feb 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári
19
10
0
11 Feb 2021
Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning
Kai Cui
Heinz Koeppl
69
91
0
02 Feb 2021
A Tutorial on Sparse Gaussian Processes and Variational Inference
Felix Leibfried
Vincent Dutordoir
S. T. John
N. Durrande
GP
42
49
0
27 Dec 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
42
101
0
22 Oct 2020