Functional Acceleration for Policy Mirror Descent
arXiv:2407.16602

23 July 2024
Veronica Chelu
Doina Precup

Papers citing "Functional Acceleration for Policy Mirror Descent"

31 / 31 papers shown
On the Convergence of Policy Mirror Descent with Temporal Difference Evaluation
Jiacai Liu
Wenye Li
Ke Wei
72
0
0
23 Sep 2025
Policy Mirror Descent with Lookahead
Kimon Protopapas
Anas Barakat
174
3
0
21 Mar 2024
Diversifying AI: Towards Creative Chess with AlphaZero
Tom Zahavy
Vivek Veeriah
Shaobo Hou
Kevin Waugh
Matthew Lai
Edouard Leurent
Nenad Tomašev
Lisa Schut
Demis Hassabis
Satinder Singh
231
20
0
17 Aug 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Neural Information Processing Systems (NeurIPS), 2023
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
269
5
0
24 May 2023
Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes
Neural Information Processing Systems (NeurIPS), 2023
Emmeran Johnson
Ciara Pike-Burke
Patrick Rebeschini
270
15
0
22 Feb 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Neural Information Processing Systems (NeurIPS), 2023
Carlo Alfano
Rui Yuan
Patrick Rebeschini
453
18
0
30 Jan 2023
No-Regret Dynamics in the Fenchel Game: A Unified Framework for Algorithmic Convex Optimization
Mathematical Programming (Math. Program.), 2021
Jun-Kun Wang
Jacob D. Abernethy
Kfir Y. Levy
385
28
0
22 Nov 2021
Understanding the Effect of Stochasticity in Policy Optimization
Neural Information Processing Systems (NeurIPS), 2021
Jincheng Mei
Bo Dai
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
216
20
0
29 Oct 2021
Approximation Benefits of Policy Gradient Methods with Aggregated States
Daniel Russo
318
7
0
22 Jul 2020
On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari
Daniel Russo
243
67
0
21 Jul 2020
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
491
96
0
20 May 2020
On the Global Convergence Rates of Softmax Policy Gradient Methods
Jincheng Mei
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
398
321
0
13 May 2020
Momentum in Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
Nino Vieillard
B. Scherrer
Olivier Pietquin
Matthieu Geist
163
35
0
21 Oct 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Annual Conference Computational Learning Theory (COLT), 2019
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
350
330
0
01 Aug 2019
Global Optimality Guarantees For Policy Gradient Methods
Operational Research (OR), 2019
Jalaj Bhandari
Daniel Russo
316
217
0
05 Jun 2019
Learning When-to-Treat Policies
Journal of the American Statistical Association (JASA), 2019
Xinkun Nie
Emma Brunskill
Stefan Wager
CML
OffRL
226
97
0
23 May 2019
The Value Function Polytope in Reinforcement Learning
International Conference on Machine Learning (ICML), 2019
Robert Dadashi
Adrien Ali Taïga
Nicolas Le Roux
Dale Schuurmans
Marc G. Bellemare
170
51
0
31 Jan 2019
Predictor-Corrector Policy Optimization
Ching-An Cheng
Xinyan Yan
Nathan D. Ratliff
Byron Boots
OnRL
184
24
0
15 Oct 2018
Acceleration through Optimistic No-Regret Dynamics
Jun-Kun Wang
Jacob D. Abernethy
338
46
0
27 Jul 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
183
522
0
14 Jun 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
1.2K
9,825
0
04 Jan 2018
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel
Joseph Modayil
H. V. Hasselt
Tom Schaul
Georg Ostrovski
Will Dabney
Dan Horgan
Bilal Piot
M. G. Azar
David Silver
OffRL
222
2,458
0
06 Oct 2017
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
246
1,673
0
21 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
1.1K
23,432
0
20 Jul 2017
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
698
9,529
0
04 Feb 2016
Accelerating Optimization via Adaptive Prediction
M. Mohri
Scott Yang
AI4CE
233
8
0
18 Sep 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
646
7,389
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
International Conference on Learning Representations (ICLR), 2014
Diederik P. Kingma
Jimmy Ba
ODL
4.4K
160,138
0
22 Dec 2014
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
267
13,129
0
19 Dec 2013
Online Learning with Predictable Sequences
Annual Conference Computational Learning Theory (COLT), 2012
Alexander Rakhlin
Karthik Sridharan
383
396
0
18 Aug 2012
Solving variational inequalities with Stochastic Mirror-Prox algorithm
A. Juditsky
A. Nemirovskii
Claire Tauvel
392
462
0
04 Sep 2008