2407.16602
Cited By
v1
v2 (latest)
Functional Acceleration for Policy Mirror Descent
23 July 2024
Veronica Chelu
Doina Precup
Papers citing
"Functional Acceleration for Policy Mirror Descent"
31 / 31 papers shown
On the Convergence of Policy Mirror Descent with Temporal Difference Evaluation
Jiacai Liu
Wenye Li
Ke Wei
72
0
0
23 Sep 2025
Policy Mirror Descent with Lookahead
Kimon Protopapas
Anas Barakat
174
3
0
21 Mar 2024
Diversifying AI: Towards Creative Chess with AlphaZero
Tom Zahavy
Vivek Veeriah
Shaobo Hou
Kevin Waugh
Matthew Lai
Edouard Leurent
Nenad Tomašev
Lisa Schut
Demis Hassabis
Satinder Singh
231
20
0
17 Aug 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Neural Information Processing Systems (NeurIPS), 2023
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
269
5
0
24 May 2023
Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes
Neural Information Processing Systems (NeurIPS), 2023
Emmeran Johnson
Ciara Pike-Burke
Patrick Rebeschini
270
15
0
22 Feb 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Neural Information Processing Systems (NeurIPS), 2023
Carlo Alfano
Rui Yuan
Patrick Rebeschini
453
18
0
30 Jan 2023
No-Regret Dynamics in the Fenchel Game: A Unified Framework for Algorithmic Convex Optimization
Mathematical Programming (Math. Program.), 2021
Jun-Kun Wang
Jacob D. Abernethy
Kfir Y. Levy
385
28
0
22 Nov 2021
Understanding the Effect of Stochasticity in Policy Optimization
Neural Information Processing Systems (NeurIPS), 2021
Jincheng Mei
Bo Dai
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
216
20
0
29 Oct 2021
Approximation Benefits of Policy Gradient Methods with Aggregated States
Daniel Russo
318
7
0
22 Jul 2020
On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari
Daniel Russo
243
67
0
21 Jul 2020
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
491
96
0
20 May 2020
On the Global Convergence Rates of Softmax Policy Gradient Methods
Jincheng Mei
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
398
321
0
13 May 2020
Momentum in Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
Nino Vieillard
B. Scherrer
Olivier Pietquin
Matthieu Geist
163
35
0
21 Oct 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Annual Conference on Computational Learning Theory (COLT), 2019
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
350
330
0
01 Aug 2019
Global Optimality Guarantees For Policy Gradient Methods
Operations Research (OR), 2019
Jalaj Bhandari
Daniel Russo
316
217
0
05 Jun 2019
Learning When-to-Treat Policies
Journal of the American Statistical Association (JASA), 2019
Xinkun Nie
Emma Brunskill
Stefan Wager
CML
OffRL
226
97
0
23 May 2019
The Value Function Polytope in Reinforcement Learning
International Conference on Machine Learning (ICML), 2019
Robert Dadashi
Adrien Ali Taïga
Nicolas Le Roux
Dale Schuurmans
Marc G. Bellemare
170
51
0
31 Jan 2019
Predictor-Corrector Policy Optimization
Ching-An Cheng
Xinyan Yan
Nathan D. Ratliff
Byron Boots
OnRL
184
24
0
15 Oct 2018
Acceleration through Optimistic No-Regret Dynamics
Jun-Kun Wang
Jacob D. Abernethy
338
46
0
27 Jul 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
183
522
0
14 Jun 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
1.2K
9,825
0
04 Jan 2018
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel
Joseph Modayil
H. V. Hasselt
Tom Schaul
Georg Ostrovski
Will Dabney
Dan Horgan
Bilal Piot
M. G. Azar
David Silver
OffRL
222
2,458
0
06 Oct 2017
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
246
1,673
0
21 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
1.1K
23,432
0
20 Jul 2017
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
698
9,529
0
04 Feb 2016
Accelerating Optimization via Adaptive Prediction
M. Mohri
Scott Yang
AI4CE
233
8
0
18 Sep 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
646
7,389
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
International Conference on Learning Representations (ICLR), 2014
Diederik P. Kingma
Jimmy Ba
ODL
4.4K
160,138
0
22 Dec 2014
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
267
13,129
0
19 Dec 2013
Online Learning with Predictable Sequences
Annual Conference on Computational Learning Theory (COLT), 2012
Alexander Rakhlin
Karthik Sridharan
383
396
0
18 Aug 2012
Solving variational inequalities with Stochastic Mirror-Prox algorithm
A. Juditsky
A. Nemirovskii
Claire Tauvel
392
462
0
04 Sep 2008