What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

10 June 2020
Marcin Andrychowicz
Anton Raichuk
Piotr Stańczyk
Manu Orsini
Sertan Girgin
Raphaël Marinier
Léonard Hussenot
Matthieu Geist
Olivier Pietquin
Marcin Michalski
Sylvain Gelly
Olivier Bachem
    OffRL

Papers citing "What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study"

50 / 136 papers shown
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRL OOD
497
9
0
13 Dec 2023
Guaranteed Trust Region Optimization via Two-Phase KL Penalization
K.R. Zentner
Ujjwal Puri
Zhehui Huang
Gaurav Sukhatme
OffRL
170
0
0
08 Dec 2023
Dropout Strategy in Reinforcement Learning: Limiting the Surrogate Objective Variance in Policy Optimization Methods
Zhengpeng Xie
Changdong Yu
Weizheng Qiao
401
2
0
31 Oct 2023
LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
Neural Information Processing Systems (NeurIPS), 2023
Yazhe Niu
Yuan Pu
Zhenjie Yang
Xueyan Li
Tong Zhou
Jiyuan Ren
Shuai Hu
Jiaming Song
Yu Liu
384
20
0
12 Oct 2023
Reinforcement Learning for Node Selection in Branch-and-Bound
Alexander Mattick
Christopher Mutschler
294
4
0
29 Sep 2023
HyperPPO: A scalable method for finding small policies for robotic control
IEEE International Conference on Robotics and Automation (ICRA), 2023
Luming Tang
Zhehui Huang
Gaurav Sukhatme
234
7
0
28 Sep 2023
Evaluation of Constrained Reinforcement Learning Algorithms for Legged Locomotion
Joonho Lee
Lukas Schroth
Victor Klemm
Marko Bjelonic
Alexander Reske
Marco Hutter
213
22
0
27 Sep 2023
Reward Function Design for Crowd Simulation via Reinforcement Learning
Motion in Games (MiG), 2023
Ariel Kwiatkowski
Vicky Kalogeiton
Julien Pettré
Marie-Paule Cani
153
5
0
22 Sep 2023
Addressing imperfect symmetry: A novel symmetry-learning actor-critic extension
Miguel Abreu
Luis Paulo Reis
Nuno Lau
311
8
0
06 Sep 2023
Commodities Trading through Deep Policy Gradient Methods
Jonas Hanetho
151
2
0
10 Aug 2023
Deep Reinforcement Learning for Autonomous Spacecraft Inspection using Illumination
David van Wijk
Kyle Dunlap
M. Majji
Kerianne L. Hobbs
154
11
0
04 Aug 2023
Benchmarking Potential Based Rewards for Learning Humanoid Locomotion
IEEE International Conference on Robotics and Automation (ICRA), 2023
Seungmin Jeon
Steve Heim
Charles Khazoom
Sangbae Kim
OffRL
155
26
0
19 Jul 2023
Comparing Reinforcement Learning and Human Learning using the Game of Hidden Rules
IEEE Access, 2023
Eric Pulick
Vladimir Menkov
Yonatan Dov Mintz
Paul B. Kantor
Vicki M. Bier
OffRL
116
2
0
30 Jun 2023
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
Physical Review X (PRX), 2023
Nishil Patel
Sebastian Lee
Stefano Sarao Mannelli
Sebastian Goldt
Andrew Saxe
OffRL
455
7
0
17 Jun 2023
Emergent Agentic Transformer from Chain of Hindsight Experience
International Conference on Machine Learning (ICML), 2023
Hao Liu
Pieter Abbeel
OffRL
269
33
0
26 May 2023
RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning
Alexander Scarlatos
Andrew Lan
OffRL LRM
268
28
0
23 May 2023
INVICTUS: Optimizing Boolean Logic Circuit Synthesis via Synergistic Learning and Search
A. B. Chowdhury
Marco Romanelli
Benjamin Tan
Ramesh Karri
S. Garg
211
3
0
22 May 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
287
4
0
11 May 2023
DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Shanchuan Wan
Yujin Tang
Yingtao Tian
Tomoyuki Kaneko
OffRL
137
8
0
21 Apr 2023
Aiding reinforcement learning for set point control
IFAC-PapersOnLine, 2023
Ruoqing Zhang
Per Mattsson
T. Wigren
188
3
0
20 Apr 2023
Robust nonlinear set-point control with reinforcement learning
American Control Conference (ACC), 2023
Ruoqing Zhang
Per Mattsson
T. Wigren
OOD
150
2
0
20 Apr 2023
Tracker: Model-based Reinforcement Learning for Tracking Control of Human Finger Attached with Thin McKibben Muscles
IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2023
Daichi Saito
Eri Nagatomo
Jefferson Pardomuan
Hideki Koike
117
0
0
01 Apr 2023
Autonomous Blimp Control via H-infinity Robust Deep Residual Reinforcement Learning
Yang Zuo
Y. Liu
Aamir Ahmad
95
2
0
24 Mar 2023
Order Matters: Agent-by-agent Policy Optimization
International Conference on Learning Representations (ICLR), 2023
Xihuai Wang
Zheng Tian
Bo Liu
Ying Wen
Jun Wang
Weinan Zhang
323
45
0
13 Feb 2023
Planning Multiple Epidemic Interventions with Reinforcement Learning
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Anh Mai
Nikunj Gupta
A. Abouzeid
Dennis Shasha
267
7
0
30 Jan 2023
Mastering Diverse Domains through World Models
Danijar Hafner
J. Pašukonis
Jimmy Ba
Timothy Lillicrap
369
862
0
10 Jan 2023
Transformers as Policies for Variable Action Environments
Niklas Zwingenberger
113
3
0
09 Jan 2023
Backward Curriculum Reinforcement Learning
IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2022
Kyungmin Ko
OnRL
214
0
0
29 Dec 2022
Explainable and Safe Reinforcement Learning for Autonomous Air Mobility
Lei Wang
Hongyu Yang
Yi Lin
S. Yin
Yuankai Wu
131
6
0
24 Nov 2022
Reinforcement learning for traffic signal control in hybrid action space
Haoqing Luo
Sheng Jin
176
15
0
23 Nov 2022
Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization
Neurons, Behavior, Data analysis, and Theory (NBDT), 2022
Burcu Küçükoglu
Walraaf Borkent
Bodo Rueckauer
Nasir Ahmad
Umut Güçlü
Marcel van Gerven
265
2
0
11 Nov 2022
Understanding the Evolution of Linear Regions in Deep Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
S. Cohan
N. Kim
David Rolnick
M. van de Panne
172
7
0
24 Oct 2022
On Many-Actions Policy Gradient
International Conference on Machine Learning (ICML), 2022
Michal Nauman
Marek Cygan
350
0
0
24 Oct 2022
Climate Change Policy Exploration using Reinforcement Learning
Theodore Wolf
114
1
0
23 Oct 2022
The Impact of Task Underspecification in Evaluating Deep Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Vindula Jayawardana
Catherine Tang
Sirui Li
Da Suo
Cathy Wu
OffRL
197
13
0
16 Oct 2022
GoalsEye: Learning High Speed Precision Table Tennis on a Physical Robot
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Tianli Ding
L. Graesser
Saminda Abeyruwan
David B. D'Ambrosio
Anish Shankar
P. Sermanet
Pannag R Sanketi
Corey Lynch
364
27
0
07 Oct 2022
Towards a Standardised Performance Evaluation Protocol for Cooperative MARL
Neural Information Processing Systems (NeurIPS), 2022
R. Gorsane
Omayma Mahjoub
Ruan de Kock
Roland Dubb
Siddarth S. Singh
Arnu Pretorius
OffRL
231
62
0
21 Sep 2022
Understanding reinforcement learned crowds
Computers & Graphics (Comput. Graph.), 2022
Ariel Kwiatkowski
Vicky Kalogeiton
Julien Pettré
Marie-Paule Cani
130
10
0
19 Sep 2022
Grounding Aleatoric Uncertainty for Unsupervised Environment Design
Neural Information Processing Systems (NeurIPS), 2022
Minqi Jiang
Michael Dennis
Jack Parker-Holder
Andrei Lupu
Heinrich Küttler
Edward Grefenstette
Tim Rocktäschel
Jakob N. Foerster
361
17
0
11 Jul 2022
Efficient entity-based reinforcement learning
Vince Jankovics
Michael Garcia Ortiz
Eduardo Alonso
OCL OffRL
128
1
0
06 Jun 2022
Challenges to Solving Combinatorially Hard Long-Horizon Deep RL Tasks
Andrew C. Li
Pashootan Vaezipoor
Rodrigo Toro Icarte
Sheila A. McIlraith
OffRL LRM
140
5
0
03 Jun 2022
Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search
International Conference on Learning Representations (ICLR), 2022
Michał Zawalski
Michał Tyrolski
K. Czechowski
Tomasz Odrzygóźdź
Damian Stachura
Piotr Piękos
Yuhuai Wu
Łukasz Kuciński
Piotr Miłoś
LRM
568
13
0
01 Jun 2022
Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Qiang He
Huangyuan Su
Jieyu Zhang
Xinwen Hou
OOD OffRL
190
9
0
29 May 2022
An Evaluation Study of Intrinsic Motivation Techniques applied to Reinforcement Learning over Hard Exploration Environments
International Cross-Domain Conference on Machine Learning and Knowledge Extraction (CD-MAKE), 2022
Alain Andres
Esther Villar-Rodriguez
Javier Del Ser
189
11
0
23 May 2022
Reinforcement Learning Policy Recommendation for Interbank Network Stability
Journal of Financial Stability (JFS), 2022
Alessio Brini
G. Tedeschi
Daniele Tantari
169
3
0
14 Apr 2022
Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and Stability
Juan Jose Garau-Luis
Yingjie Miao
John D. Co-Reyes
Aaron T Parisi
Jie Tan
Esteban Real
Aleksandra Faust
244
0
0
08 Apr 2022
Combining imitation and deep reinforcement learning to accomplish human-level performance on a virtual foraging task
Adaptive Behavior (AB), 2022
Vittorio Giammarino
Matthew F. Dunne
Kylie N. Moore
Michael Hasselmo
Chantal E. Stern
I. Paschalidis
OffRL
354
5
0
11 Mar 2022
Learning Torque Control for Quadrupedal Locomotion
IEEE-RAS International Conference on Humanoid Robots (Humanoids), 2022
Shuxiao Chen
Bike Zhang
M. Mueller
Akshara Rai
Koushil Sreenath
254
53
0
10 Mar 2022
A Survey on Reinforcement Learning Methods in Character Animation
Ariel Kwiatkowski
Eduardo Alvarado
Vicky Kalogeiton
Chenxi Liu
Julien Pettré
M. van de Panne
Marie-Paule Cani
AI4CE
264
63
0
07 Mar 2022
You May Not Need Ratio Clipping in PPO
Mingfei Sun
Vitaly Kurin
Guoqing Liu
Sam Devlin
Tao Qin
Katja Hofmann
Shimon Whiteson
206
18
0
31 Jan 2022