Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1811.11214
Cited By
Understanding the impact of entropy on policy optimization
27 November 2018
Zafarali Ahmed
Nicolas Le Roux
Mohammad Norouzi
Dale Schuurmans
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Understanding the impact of entropy on policy optimization"
50 / 58 papers shown
Title
DYSTIL: Dynamic Strategy Induction with Large Language Models for Reinforcement Learning
Borui Wang
Kathleen McKeown
Rex Ying
OffRL
39
0
0
06 May 2025
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh
Pradeep Varakantham
Peter Vamplew
OffRL
34
0
0
02 Mar 2025
Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability
Qingyue Zhao
Kaixuan Ji
Heyang Zhao
Tong Zhang
Q. Gu
OffRL
40
0
0
09 Feb 2025
The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking
Yuchun Miao
Sen Zhang
Liang Ding
Yuqi Zhang
L. Zhang
Dacheng Tao
81
3
0
31 Jan 2025
Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning
Rémy Hosseinkhan Boucher
Onofrio Semeraro
L. Mathelin
74
0
0
28 Jan 2025
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning
Kun Wu
Yinuo Zhao
Z. Xu
Zhengping Che
Chengxiang Yin
C. Liu
Qinru Qiu
Feiferi Feng
OffRL
100
1
0
22 Dec 2024
Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
Eliot Xing
Vernon Luk
Jean Oh
84
0
0
16 Dec 2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
57
3
0
07 Nov 2024
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification
Suorong Yang
F. Shen
Jian Zhao
AAML
29
1
0
10 Sep 2024
q-exponential family for policy optimization
Lingwei Zhu
Haseeb Shah
Han Wang
Yukie Nagai
Martha White
OffRL
73
0
0
14 Aug 2024
Reinforcement Learning for Efficient Design and Control Co-optimisation of Energy Systems
Marine Cauz
Adrien Bolland
Nicolas Wyrsch
Christophe Ballif
37
0
0
28 Jun 2024
Optimal Rates of Convergence for Entropy Regularization in Discounted Markov Decision Processes
Johannes Muller
Semih Cayci
34
0
0
06 Jun 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
38
2
0
30 May 2024
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Alessandro Montenegro
Marco Mussi
Alberto Maria Metelli
Matteo Papini
42
2
0
03 May 2024
Behind the Myth of Exploration in Policy Gradients
Adrien Bolland
Gaspard Lambrechts
Damien Ernst
51
0
0
31 Jan 2024
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
27
2
0
26 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
14
1
0
23 Jan 2024
Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks
Andrew Starnes
Anton Dereventsov
Clayton Webster
24
0
0
09 Oct 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
33
3
0
11 May 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
36
71
0
28 Mar 2023
Adaptive Federated Learning via New Entropy Approach
Shensheng Zheng
Wenhao Yuan
Xuehe Wang
Ling-Yu Duan
FedML
OOD
22
1
0
27 Mar 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
21
0
0
22 Mar 2023
Multi-Start Team Orienteering Problem for UAS Mission Re-Planning with Data-Efficient Deep Reinforcement Learning
Dong Ho Lee
Jaemyung Ahn
20
6
0
02 Mar 2023
Differentiable Arbitrating in Zero-sum Markov Games
Jing Wang
Meichen Song
Feng Gao
Boyi Liu
Zhaoran Wang
Yi Wu
32
2
0
20 Feb 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Mengdi Wang
Furong Huang
Dinesh Manocha
24
7
0
28 Jan 2023
Robust Scheduling with GFlowNets
David W. Zhang
Corrado Rainone
M. Peschl
Roberto Bondesan
29
48
0
17 Jan 2023
Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks
Anton Dereventsov
Andrew Starnes
Clayton Webster
18
4
0
21 Nov 2022
Hybrid Learning- and Model-Based Planning and Control of In-Hand Manipulation
Rana Soltani-Zarrin
K. Yamane
Rianna M. Jitosho
46
7
0
20 Sep 2022
Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization
Leo Feng
Padideh Nouri
Aneri Muni
Yoshua Bengio
Pierre-Luc Bacon
108
4
0
13 Sep 2022
Entropy Regularization for Population Estimation
Ben Chugg
Peter Henderson
Jacob Goldin
Daniel E. Ho
26
3
0
24 Aug 2022
Entropy Augmented Reinforcement Learning
Jianfei Ma
28
0
0
19 Aug 2022
Accelerating Primal-dual Methods for Regularized Markov Decision Processes
Haoya Li
Hsiang-Fu Yu
Lexing Ying
Inderjit Dhillon
26
4
0
21 Feb 2022
A Globally Convergent Evolutionary Strategy for Stochastic Constrained Optimization with Applications to Reinforcement Learning
Youssef Diouane
Aurélien Lucchi
Vihang Patil
19
3
0
21 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
38
2
0
15 Feb 2022
A Differential Entropy Estimator for Training Neural Networks
Georg Pichler
Pierre Colombo
Malik Boudiaf
Günther Koliander
Pablo Piantanida
20
21
0
14 Feb 2022
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
Matilde Gargiani
Andrea Zanelli
Andrea Martinelli
Tyler H. Summers
John Lygeros
33
14
0
01 Feb 2022
Demystifying Reinforcement Learning in Time-Varying Systems
Pouya Hamadanian
Malte Schwarzkopf
Siddartha Sen
MohammadIman Alizadeh
35
1
0
14 Jan 2022
A Statistical Analysis of Polyak-Ruppert Averaged Q-learning
Xiang Li
Wenhao Yang
Jiadong Liang
Zhihua Zhang
Michael I. Jordan
32
15
0
29 Dec 2021
Variational Quantum Soft Actor-Critic
Qingfeng Lan
14
20
0
20 Dec 2021
Reward-Free Attacks in Multi-Agent Reinforcement Learning
Ted Fujimoto
T. Doster
A. Attarian
Jill M. Brandenberger
Nathan Oken Hodas
AAML
19
4
0
02 Dec 2021
Towards an Understanding of Default Policies in Multitask Policy Optimization
Theodore H. Moskovitz
Michael Arbel
Jack Parker-Holder
Aldo Pacchiano
17
9
0
04 Nov 2021
Approximate Newton policy gradient algorithms
Haoya Li
Samarth Gupta
Hsiangfu Yu
Lexing Ying
Inderjit Dhillon
41
2
0
05 Oct 2021
Shapley Counterfactual Credits for Multi-Agent Reinforcement Learning
Jiahui Li
Kun Kuang
Baoxiang Wang
Furui Liu
Long Chen
Fei Wu
Jun Xiao
OffRL
10
60
0
01 Jun 2021
Safe Chance Constrained Reinforcement Learning for Batch Process Control
M. Mowbray
Panagiotis Petsagkourakis
Ehecatl Antonio del Rio Chanona
Dongda Zhang
OffRL
27
34
0
23 Apr 2021
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
Benjamin Eysenbach
Sergey Levine
OOD
24
174
0
10 Mar 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári
19
10
0
11 Feb 2021
Advances in Electron Microscopy with Deep Learning
Jeffrey M. Ede
29
2
0
04 Jan 2021
Behavior Priors for Efficient Reinforcement Learning
Dhruva Tirumala
Alexandre Galashov
Hyeonwoo Noh
Leonard Hasenclever
Razvan Pascanu
...
Guillaume Desjardins
Wojciech M. Czarnecki
Arun Ahuja
Yee Whye Teh
N. Heess
30
39
0
27 Oct 2020
Softmax Deep Double Deterministic Policy Gradients
Ling Pan
Qingpeng Cai
Longbo Huang
72
86
0
19 Oct 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
29
79
0
17 Sep 2020
1
2
Next