Understanding the impact of entropy on policy optimization

27 November 2018

Nicolas Le Roux

Papers citing "Understanding the impact of entropy on policy optimization"

50 / 58 papers shown

DYSTIL: Dynamic Strategy Induction with Large Language Models for Reinforcement Learning Borui Wang Kathleen McKeown Rex Ying OffRL 39 0 0 06 May 2025
On Generalization Across Environments In Multi-Objective Reinforcement Learning Jayden Teoh Pradeep Varakantham Peter Vamplew OffRL 34 0 0 02 Mar 2025
Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability Qingyue Zhao Kaixuan Ji Heyang Zhao Tong Zhang Q. Gu OffRL 40 0 0 09 Feb 2025
The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking Yuchun Miao Sen Zhang Liang Ding Yuqi Zhang L. Zhang Dacheng Tao 81 3 0 31 Jan 2025
Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning Rémy Hosseinkhan Boucher Onofrio Semeraro L. Mathelin 74 0 0 28 Jan 2025
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning Kun Wu Yinuo Zhao Z. Xu Zhengping Che Chengxiang Yin C. Liu Qinru Qiu Feifei Feng OffRL 100 1 0 22 Dec 2024
Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation Eliot Xing Vernon Luk Jean Oh 84 0 0 16 Dec 2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF Heyang Zhao Chenlu Ye Quanquan Gu Tong Zhang OffRL 57 3 0 07 Nov 2024
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification Suorong Yang F. Shen Jian Zhao AAML 29 1 0 10 Sep 2024
q-exponential family for policy optimization Lingwei Zhu Haseeb Shah Han Wang Yukie Nagai Martha White OffRL 73 0 0 14 Aug 2024
Reinforcement Learning for Efficient Design and Control Co-optimisation of Energy Systems Marine Cauz Adrien Bolland Nicolas Wyrsch Christophe Ballif 37 0 0 28 Jun 2024
Optimal Rates of Convergence for Entropy Regularization in Discounted Markov Decision Processes Johannes Muller Semih Cayci 34 0 0 06 Jun 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity Yan Yang Bin Gao Ya-xiang Yuan 38 2 0 30 May 2024
Learning Optimal Deterministic Policies with Stochastic Policy Gradients Alessandro Montenegro Marco Mussi Alberto Maria Metelli Matteo Papini 42 2 0 03 May 2024
Behind the Myth of Exploration in Policy Gradients Adrien Bolland Gaspard Lambrechts Damien Ernst 51 0 0 31 Jan 2024
Regularized Q-Learning with Linear Function Approximation Jiachen Xi Alfredo Garcia P. Momcilovic 27 2 0 26 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization Ling Liang Haizhao Yang 14 1 0 23 Jan 2024
Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks Andrew Starnes Anton Dereventsov Clayton Webster 24 0 0 09 Oct 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation Adrien Bolland Gilles Louppe D. Ernst 33 3 0 11 May 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization Haoran Xu Li Jiang Jianxiong Li Zhuoran Yang Zhaoran Wang Victor Chan Xianyuan Zhan OffRL 36 71 0 28 Mar 2023
Adaptive Federated Learning via New Entropy Approach Shensheng Zheng Wenhao Yuan Xuehe Wang Ling-Yu Duan FedML OOD 22 1 0 27 Mar 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality François Ged M. H. Veiga 21 0 0 22 Mar 2023
Multi-Start Team Orienteering Problem for UAS Mission Re-Planning with Data-Efficient Deep Reinforcement Learning Dong Ho Lee Jaemyung Ahn 20 6 0 02 Mar 2023
Differentiable Arbitrating in Zero-sum Markov Games Jing Wang Meichen Song Feng Gao Boyi Liu Zhaoran Wang Yi Wu 32 2 0 20 Feb 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning Souradip Chakraborty Amrit Singh Bedi Alec Koppel Mengdi Wang Furong Huang Dinesh Manocha 24 7 0 28 Jan 2023
Robust Scheduling with GFlowNets David W. Zhang Corrado Rainone M. Peschl Roberto Bondesan 29 48 0 17 Jan 2023
Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks Anton Dereventsov Andrew Starnes Clayton Webster 18 4 0 21 Nov 2022
Hybrid Learning- and Model-Based Planning and Control of In-Hand Manipulation Rana Soltani-Zarrin K. Yamane Rianna M. Jitosho 46 7 0 20 Sep 2022
Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization Leo Feng Padideh Nouri Aneri Muni Yoshua Bengio Pierre-Luc Bacon 108 4 0 13 Sep 2022
Entropy Regularization for Population Estimation Ben Chugg Peter Henderson Jacob Goldin Daniel E. Ho 26 3 0 24 Aug 2022
Entropy Augmented Reinforcement Learning Jianfei Ma 28 0 0 19 Aug 2022
Accelerating Primal-dual Methods for Regularized Markov Decision Processes Haoya Li Hsiang-Fu Yu Lexing Ying Inderjit Dhillon 26 4 0 21 Feb 2022
A Globally Convergent Evolutionary Strategy for Stochastic Constrained Optimization with Applications to Reinforcement Learning Youssef Diouane Aurélien Lucchi Vihang Patil 19 3 0 21 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms Romain Laroche Rémi Tachet des Combes 38 2 0 15 Feb 2022
A Differential Entropy Estimator for Training Neural Networks Georg Pichler Pierre Colombo Malik Boudiaf Günther Koliander Pablo Piantanida 20 21 0 14 Feb 2022
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation Matilde Gargiani Andrea Zanelli Andrea Martinelli Tyler H. Summers John Lygeros 33 14 0 01 Feb 2022
Demystifying Reinforcement Learning in Time-Varying Systems Pouya Hamadanian Malte Schwarzkopf Siddartha Sen MohammadIman Alizadeh 35 1 0 14 Jan 2022
A Statistical Analysis of Polyak-Ruppert Averaged Q-learning Xiang Li Wenhao Yang Jiadong Liang Zhihua Zhang Michael I. Jordan 32 15 0 29 Dec 2021
Variational Quantum Soft Actor-Critic Qingfeng Lan 14 20 0 20 Dec 2021
Reward-Free Attacks in Multi-Agent Reinforcement Learning Ted Fujimoto T. Doster A. Attarian Jill M. Brandenberger Nathan Oken Hodas AAML 19 4 0 02 Dec 2021
Towards an Understanding of Default Policies in Multitask Policy Optimization Theodore H. Moskovitz Michael Arbel Jack Parker-Holder Aldo Pacchiano 17 9 0 04 Nov 2021
Approximate Newton policy gradient algorithms Haoya Li Samarth Gupta Hsiangfu Yu Lexing Ying Inderjit Dhillon 41 2 0 05 Oct 2021
Shapley Counterfactual Credits for Multi-Agent Reinforcement Learning Jiahui Li Kun Kuang Baoxiang Wang Furui Liu Long Chen Fei Wu Jun Xiao OffRL 10 60 0 01 Jun 2021
Safe Chance Constrained Reinforcement Learning for Batch Process Control M. Mowbray Panagiotis Petsagkourakis Ehecatl Antonio del Rio Chanona Dongda Zhang OffRL 27 34 0 23 Apr 2021
Maximum Entropy RL (Provably) Solves Some Robust RL Problems Benjamin Eysenbach Sergey Levine OOD 24 174 0 10 Mar 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration N. Lazić Botao Hao Yasin Abbasi-Yadkori Dale Schuurmans Csaba Szepesvári 19 10 0 11 Feb 2021
Advances in Electron Microscopy with Deep Learning Jeffrey M. Ede 29 2 0 04 Jan 2021
Behavior Priors for Efficient Reinforcement Learning Dhruva Tirumala Alexandre Galashov Hyeonwoo Noh Leonard Hasenclever Razvan Pascanu ... Guillaume Desjardins Wojciech M. Czarnecki Arun Ahuja Yee Whye Teh N. Heess 30 39 0 27 Oct 2020
Softmax Deep Double Deterministic Policy Gradients Ling Pan Qingpeng Cai Longbo Huang 72 86 0 19 Oct 2020
Review: Deep Learning in Electron Microscopy Jeffrey M. Ede 29 79 0 17 Sep 2020