ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1802.10031
  4. Cited By
The Mirage of Action-Dependent Baselines in Reinforcement Learning

The Mirage of Action-Dependent Baselines in Reinforcement Learning

27 February 2018
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
    OffRL
ArXivPDFHTML

Papers citing "The Mirage of Action-Dependent Baselines in Reinforcement Learning"

26 / 26 papers shown
Title
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Wenjun Cao
52
0
0
26 Apr 2025
Multi-Fidelity Policy Gradient Algorithms
Multi-Fidelity Policy Gradient Algorithms
Xinjie Liu
Cyrus Neary
Kushagra Gupta
Christian Ellis
Ufuk Topcu
David Fridovich-Keil
OffRL
188
0
0
07 Mar 2025
Multi-agent Reinforcement Learning: A Comprehensive Survey
Multi-agent Reinforcement Learning: A Comprehensive Survey
Dom Huh
Prasant Mohapatra
AI4CE
36
8
0
15 Dec 2023
An Invitation to Deep Reinforcement Learning
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRL
OOD
78
5
0
13 Dec 2023
Distillation Policy Optimization
Distillation Policy Optimization
Jianfei Ma
OffRL
26
1
0
01 Feb 2023
The Role of Baselines in Policy Gradient Optimization
The Role of Baselines in Policy Gradient Optimization
Jincheng Mei
Wesley Chung
Valentin Thomas
Bo Dai
Csaba Szepesvári
Dale Schuurmans
29
15
0
16 Jan 2023
On Many-Actions Policy Gradient
On Many-Actions Policy Gradient
Michal Nauman
Marek Cygan
19
0
0
24 Oct 2022
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned
  Reinforcement Learning
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning
Yunfei Li
Tian Gao
Jiaqi Yang
Huazhe Xu
Yi Wu
OffRL
28
22
0
24 Jun 2022
Settling the Variance of Multi-Agent Policy Gradients
Settling the Variance of Multi-Agent Policy Gradients
J. Kuba
Muning Wen
Yaodong Yang
Linghui Meng
Shangding Gu
Haifeng Zhang
D. Mguni
Jun Wang
24
58
0
19 Aug 2021
A Minimalist Approach to Offline Reinforcement Learning
A Minimalist Approach to Offline Reinforcement Learning
Scott Fujimoto
S. Gu
OffRL
58
780
0
12 Jun 2021
What Matters for Adversarial Imitation Learning?
What Matters for Adversarial Imitation Learning?
Manu Orsini
Anton Raichuk
Léonard Hussenot
Damien Vincent
Robert Dadashi
Sertan Girgin
M. Geist
Olivier Bachem
Olivier Pietquin
Marcin Andrychowicz
42
77
0
01 Jun 2021
Factored Policy Gradients: Leveraging Structure for Efficient Learning
  in MOMDPs
Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs
Thomas Spooner
N. Vadori
Sumitra Ganesh
30
7
0
20 Feb 2021
How to Make Deep RL Work in Practice
How to Make Deep RL Work in Practice
Nirnai Rao
Elie Aljalbout
Axel Sauer
Sami Haddadin
OffRL
21
11
0
25 Oct 2020
What Matters In On-Policy Reinforcement Learning? A Large-Scale
  Empirical Study
What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study
Marcin Andrychowicz
Anton Raichuk
Piotr Stańczyk
Manu Orsini
Sertan Girgin
...
M. Geist
Olivier Pietquin
Marcin Michalski
Sylvain Gelly
Olivier Bachem
OffRL
31
213
0
10 Jun 2020
SLM Lab: A Comprehensive Benchmark and Modular Software Framework for
  Reproducible Deep Reinforcement Learning
SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning
Keng Wah Loon
L. Graesser
Milan Cvitkovic
OffRL
16
13
0
28 Dec 2019
From Importance Sampling to Doubly Robust Policy Gradient
From Importance Sampling to Doubly Robust Policy Gradient
Jiawei Huang
Nan Jiang
OffRL
16
24
0
20 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance
  Reduction
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
31
83
0
18 Sep 2019
Trajectory-wise Control Variates for Variance Reduction in Policy
  Gradient Methods
Trajectory-wise Control Variates for Variance Reduction in Policy Gradient Methods
Ching-An Cheng
Xinyan Yan
Byron Boots
22
22
0
08 Aug 2019
Sub-policy Adaptation for Hierarchical Reinforcement Learning
Sub-policy Adaptation for Hierarchical Reinforcement Learning
Alexander C. Li
Carlos Florensa
I. Clavera
Pieter Abbeel
23
71
0
13 Jun 2019
P3O: Policy-on Policy-off Policy Optimization
P3O: Policy-on Policy-off Policy Optimization
Rasool Fakoor
Pratik Chaudhari
Alex Smola
OffRL
15
51
0
05 May 2019
A Closer Look at Deep Policy Gradients
A Closer Look at Deep Policy Gradients
Andrew Ilyas
Logan Engstrom
Shibani Santurkar
Dimitris Tsipras
Firdaus Janoos
Larry Rudolph
Aleksander Madry
30
50
0
06 Nov 2018
A Survey and Critique of Multiagent Deep Reinforcement Learning
A Survey and Critique of Multiagent Deep Reinforcement Learning
Pablo Hernandez-Leal
Bilal Kartal
Matthew E. Taylor
OffRL
32
550
0
12 Oct 2018
Variance Reduction in Monte Carlo Counterfactual Regret Minimization
  (VR-MCCFR) for Extensive Form Games using Baselines
Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines
Martin Schmid
Neil Burch
Marc Lanctot
Matej Moravcík
Rudolf Kadlec
Michael Bowling
26
64
0
09 Sep 2018
Variance Reduction for Reinforcement Learning in Input-Driven
  Environments
Variance Reduction for Reinforcement Learning in Input-Driven Environments
Hongzi Mao
S. Venkatakrishnan
Malte Schwarzkopf
Mohammad Alizadeh
OffRL
41
94
0
06 Jul 2018
Policy Optimization with Second-Order Advantage Information
Policy Optimization with Second-Order Advantage Information
Jiajin Li
Baoxiang Wang
22
6
0
09 May 2018
Backpropagation through the Void: Optimizing control variates for
  black-box gradient estimation
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
38
300
0
31 Oct 2017
1