2003.14089
Cited By
Versions: v1, v2, v3, v4, v5 (latest)
Leverage the Average: an Analysis of KL Regularization in RL
31 March 2020
Nino Vieillard
Tadashi Kozuno
B. Scherrer
Olivier Pietquin
Rémi Munos
Matthieu Geist
Papers citing "Leverage the Average: an Analysis of KL Regularization in RL"
32 / 32 papers shown
Symmetric Behavior Regularized Policy Optimization
Lingwei Zhu
Zheng Chen
Yukie Nagai
Martha White
OffRL
254
0
0
06 Aug 2025
Dual Approximation Policy Optimization
Zhihan Xiong
Maryam Fazel
Lin Xiao
286
1
0
02 Oct 2024
q-exponential family for policy optimization
International Conference on Learning Representations (ICLR), 2024
Lingwei Zhu
Haseeb Shah
Zheng Chen
Yukie Nagai
Martha White
OffRL
561
3
0
14 Aug 2024
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
Cassidy Laidlaw
Shivam Singhal
Anca Dragan
AAML
489
11
0
05 Mar 2024
Guaranteed Trust Region Optimization via Two-Phase KL Penalization
K.R. Zentner
Ujjwal Puri
Zhehui Huang
Gaurav Sukhatme
OffRL
253
0
0
08 Dec 2023
Acceleration in Policy Optimization
Veronica Chelu
Tom Zahavy
A. Guez
Doina Precup
Sebastian Flennerhag
356
0
0
18 Jun 2023
Towards Minimax Optimality of Model-based Robust Reinforcement Learning
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Pierre Clavier
E. L. Pennec
Matthieu Geist
479
20
0
10 Feb 2023
Generalized Munchausen Reinforcement Learning using Tsallis KL Divergence
Neural Information Processing Systems (NeurIPS), 2023
Lingwei Zhu
Zheng Chen
Takamitsu Matsubara
Martha White
312
3
0
27 Jan 2023
Extreme Q-Learning: MaxEnt RL without Entropy
International Conference on Learning Representations (ICLR), 2023
Divyansh Garg
Joey Hejna
Matthieu Geist
Stefano Ermon
OffRL
325
114
0
05 Jan 2023
Latent State Marginalization as a Low-cost Approach for Improving Exploration
International Conference on Learning Representations (ICLR), 2022
Dinghuai Zhang
Aaron Courville
Yoshua Bengio
Qinqing Zheng
Amy Zhang
Ricky T. Q. Chen
OOD
352
12
0
03 Oct 2022
q-Munchausen Reinforcement Learning
Lingwei Zhu
Zheng Chen
E. Uchibe
Takamitsu Matsubara
OffRL
159
0
0
16 May 2022
Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning
Lingwei Zhu
Zheng Chen
E. Uchibe
Takamitsu Matsubara
130
1
0
16 May 2022
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
Alexis Jacq
Johan Ferret
Olivier Pietquin
Matthieu Geist
219
11
0
16 Mar 2022
Do You Need the Entropy Reward (in Practice)?
Haonan Yu
Haichao Zhang
Wei Xu
268
12
0
28 Jan 2022
Actor Loss of Soft Actor Critic Explained
Thibault Lahire
130
2
0
31 Dec 2021
Error Controlled Actor-Critic
Information Sciences (Inf. Sci.), 2021
Xingen Gao
Jiayi Ji
Changle Zhou
Zhen Ge
Chih-Min Lin
Longzhi Yang
Xiang Chang
C. Shang
125
4
0
06 Sep 2021
Implicitly Regularized RL with Implicit Q-Values
Nino Vieillard
Marcin Andrychowicz
Anton Raichuk
Olivier Pietquin
Matthieu Geist
OffRL
234
9
0
16 Aug 2021
Cautious Policy Programming: Exploiting KL Regularization in Monotonic Policy Improvement for Reinforcement Learning
Lingwei Zhu
Toshinori Kitamura
Takamitsu Matsubara
OffRL
209
1
0
13 Jul 2021
Bayesian Bellman Operators
Neural Information Processing Systems (NeurIPS), 2021
M. Fellows
Kristian Hartikainen
Shimon Whiteson
OffRL
260
18
0
09 Jun 2021
Muesli: Combining Improvements in Policy Optimization
International Conference on Machine Learning (ICML), 2021
Matteo Hessel
Ivo Danihelka
Fabio Viola
A. Guez
Simon Schmitt
Laurent Sifre
T. Weber
David Silver
H. V. Hasselt
314
69
0
13 Apr 2021
Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Hiroki Furuta
Tadashi Kozuno
T. Matsushima
Y. Matsuo
S. Gu
414
14
0
31 Mar 2021
Near Optimal Policy Optimization via REPS
Neural Information Processing Systems (NeurIPS), 2021
Aldo Pacchiano
Jonathan Lee
Peter L. Bartlett
Ofir Nachum
259
3
0
17 Mar 2021
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
International Conference on Learning Representations (ICLR), 2021
Benjamin Eysenbach
Sergey Levine
OOD
342
241
0
10 Mar 2021
Improved Regret Bound and Experience Replay in Regularized Policy Iteration
International Conference on Machine Learning (ICML), 2021
N. Lazić
Dong Yin
Yasin Abbasi-Yadkori
Csaba Szepesvári
OffRL
158
20
0
25 Feb 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári
142
15
0
11 Feb 2021
Adversarially Guided Actor-Critic
International Conference on Learning Representations (ICLR), 2021
Yannis Flet-Berliac
Johan Ferret
Olivier Pietquin
Philippe Preux
Matthieu Geist
228
78
0
08 Feb 2021
Reinforcement Learning Control of a Biomechanical Model of the Upper Extremity
Scientific Reports (Sci Rep), 2020
F. Fischer
Miroslav Bachinski
Markus Klar
A. Fleig
Jorg Muller
213
58
0
13 Nov 2020
Finding the Near Optimal Policy via Adaptive Reduced Regularization in MDPs
Wenhao Yang
Xiang Li
Guangzeng Xie
Zhihua Zhang
232
5
0
31 Oct 2020
Logistic Q-Learning
Joan Bas-Serrano
Sebastian Curi
Andreas Krause
Gergely Neu
497
43
0
21 Oct 2020
Learning Off-Policy with Online Planning
Harshit S. Sikchi
Wenxuan Zhou
David Held
OffRL
653
65
0
23 Aug 2020
Munchausen Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2020
Nino Vieillard
Olivier Pietquin
Matthieu Geist
OffRL
272
107
0
28 Jul 2020
Discount Factor as a Regularizer in Reinforcement Learning
Ron Amit
Ron Meir
K. Ciosek
OffRL
277
85
0
04 Jul 2020