What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

10 June 2020

Sertan Girgin

Olivier Pietquin

Olivier Bachem

Papers citing "What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study"

50 / 136 papers shown

Tactile-based Object Retrieval From Granular Media

Shuran Song

195

24 Dec 2025

Deep Reinforcement Learning for Dynamic Algorithm Configuration: A Case Study on Optimizing OneMax with the (1+(λ,λ))-GA

03 Dec 2025

Differentiable Weightless Controllers: Learning Logic Circuits for Continuous Control

Fabian Kresse

Christoph H. Lampert

204

01 Dec 2025

Boosting Reinforcement Learning in 3D Visuospatial Tasks Through Human-Informed Curriculum Design

M. Solbach

John K. Tsotsos

OffRL

162

17 Nov 2025

Learning Without Critics? Revisiting GRPO in Classical Reinforcement Learning Environments

Bryan L. M. de Oliveira

Felipe V. Frujeri

Marcos P. C. M. Queiroz

Luana G. B. Martins

Telma W. de L. Soares

Luckeciano C. Melo

OffRL

175

05 Nov 2025

Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning

...

146

13 Oct 2025

Single-stream Policy Optimization

Zhongwen Xu

Zihan Ding

OffRL

187

16 Sep 2025

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

...

119

11 Aug 2025

...

OffRL ReLM MoE AI4TS LRM

304

12 Jun 2025

FAuNO: Semi-Asynchronous Federated Reinforcement Learning Framework for Task Offloading in Edge Systems

149

03 Jun 2025

Learning coordinated badminton skills for legged manipulators

269

29 May 2025

A critical assessment of reinforcement learning methods for microswimmer navigation in complex flows

Selim Mecanna

Aurore Loisy

Christophe Eloy

250

08 May 2025

Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance

Wenjun Cao

223

26 Apr 2025

Adaptive Insurance Reserving with CVaR-Constrained Reinforcement Learning under Macroeconomic Regimes

Stella C. Dong

James R. Finlay

154

13 Apr 2025

A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility

Christian Schroeder de Witt

Matthias Bethge

ReLM ALM LRM

602

09 Apr 2025

Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

427

03 Apr 2025

Differentiable Information Enhanced Model-Based Reinforcement Learning

AAAI Conference on Artificial Intelligence (AAAI), 2025

248

03 Mar 2025

Average-Reward Soft Actor-Critic

264

15 Jan 2025

Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps

Neural Information Processing Systems (NeurIPS), 2024

Benjamin Ellis

Matthew Jackson

Andrei Lupu

Alexander David Goldie

Mattie Fellows

Shimon Whiteson

Jakob Foerster

354

22 Dec 2024

Multi-Task Reinforcement Learning for Quadrotors

IEEE Robotics and Automation Letters (RA-L), 2024

360

17 Dec 2024

A Method for Evaluating Hyperparameter Sensitivity in Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2024

Jacob Adkins

Michael Bowling

Adam White

334

10 Dec 2024

Beyond the Boundaries of Proximal Policy Optimization

219

01 Nov 2024

Fast Deep Hedging with Second-Order Optimization

International Conference on AI in Finance (ICAF), 2024

247

29 Oct 2024

AgentForge: A Flexible Low-Code Platform for Reinforcement Learning Agent Design

International Conference on Agents and Artificial Intelligence (ICAART), 2024

Francisco Erivaldo Fernandes Junior

Antti Oulasvirta

1.1K

25 Oct 2024

Streaming Deep Reinforcement Learning Finally Works

277

18 Oct 2024

SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning

International Conference on Learning Representations (ICLR), 2024

464

13 Oct 2024

Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient

International Conference on Learning Representations (ICLR), 2024

1.0K

11 Oct 2024

Effective Tuning Strategies for Generalist Robot Manipulation Policies

IEEE International Conference on Robotics and Automation (ICRA), 2024

Wenbo Zhang

Yang Li

Jiajun Liu

Lingqiao Liu

173

02 Oct 2024

Gradient Boosting Reinforcement Learning

476

11 Jul 2024

Structural Design Through Reinforcement Learning

Thomas Rochefort-Beaudoin

143

10 Jul 2024

Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner

Kenneth Li

Yiming Wang

Fernanda Viégas

Martin Wattenberg

264

17 Jun 2024

FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning

347

04 Jun 2024

A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning

Arthur Juliani

Jordan T. Ash

OffRL OnRL CLL

280

29 May 2024

Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control

253

25 May 2024

Multi-turn Reinforcement Learning from Preference Human Feedback

Lior Shani

Aviv Rosenberg

Asaf B. Cassel

Oran Lang

Daniele Calandriello

...

Bilal Piot

Idan Szpektor

Avinatan Hassidim

Yossi Matias

Rémi Munos

219

23 May 2024

Decentralized Coordination of Distributed Energy Resources through Local Energy Markets and Deep Reinforcement Learning

Daniel May

Matthew E. Taylor

Petr Musílek

155

19 Apr 2024

If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions

305

25 Mar 2024

Simple Ingredients for Offline Reinforcement Learning

327

19 Mar 2024

Generalising Multi-Agent Cooperation through Task-Agnostic Communication

Dulhan Jayalath

Steven D. Morad

Amanda Prorok

164

11 Mar 2024

A Case for Validation Buffer in Pessimistic Actor-Critic

Michal Nauman

M. Ostaszewski

Marek Cygan

227

01 Mar 2024

Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning

343

01 Mar 2024

Beacon, a lightweight deep reinforcement learning benchmark library for flow control

216

27 Feb 2024

Natural Language Reinforcement Learning

278

11 Feb 2024

Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

Rousslan Fernand Julien Dossa

...

263

05 Feb 2024

Behind the Myth of Exploration in Policy Gradients

Adrien Bolland

Gaspard Lambrechts

Damien Ernst

358

31 Jan 2024

The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations

Matthias Lehmann

283

24 Jan 2024

Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization

International Conference on Learning Representations (ICLR), 2024

193

22 Jan 2024

ReFT: Reasoning with Reinforced Fine-Tuning

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

321

238

17 Jan 2024

EgoGen: An Egocentric Synthetic Data Generator

Computer Vision and Pattern Recognition (CVPR), 2024

Marc Pollefeys

Siyu Tang

EgoV VGen

447

16 Jan 2024

XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX

455

19 Dec 2023