Efficiently Solving MDPs with Stochastic Mirror Descent


International Conference on Machine Learning (ICML), 2020
28 August 2020
Yujia Jin
Aaron Sidford

Papers citing "Efficiently Solving MDPs with Stochastic Mirror Descent"

46 / 46 papers shown
Shuffling Heuristic in Variational Inequalities: Establishing New Convergence Guarantees
Daniil Medyakov
Gleb Molodtsov
Grigoriy Evseev
Egor Petrov
Aleksandr Beznosikov
337
3
0
04 Sep 2025
Apprenticeship learning with prior beliefs using inverse optimization
Mauricio Junca
Esteban Leiva
200
0
0
27 May 2025
Layer-wise Quantization for Quantized Optimistic Dual Averaging
Anh Duc Nguyen
Ilia Markov
Frank Zhengqing Wu
Ali Ramezani-Kebrya
Kimon Antonakopoulos
Dan Alistarh
Volkan Cevher
MQ
260
1
0
20 May 2025
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices
Lixing Lyu
Jiashuo Jiang
Wang Chi Cheung
275
3
0
24 Feb 2025
Computing Optimal Regularizers for Online Linear Optimization
Annual Conference Computational Learning Theory (COLT), 2024
Khashayar Gatmiry
Jon Schneider
Stefanie Jegelka
179
3
0
22 Oct 2024
Finding good policies in average-reward Markov Decision Processes without prior knowledge
Adrienne Tuynman
Rémy Degenne
Emilie Kaufmann
272
9
0
27 May 2024
Truncated Variance Reduced Value Iteration
Yujia Jin
Ishani Karmarkar
Aaron Sidford
Jiayi Wang
OffRL
272
7
0
21 May 2024
Stochastic Halpern iteration in normed spaces and applications to reinforcement learning
Mario Bravo
Juan Pablo Contreras
364
9
0
19 Mar 2024
Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs
Neural Information Processing Systems (NeurIPS), 2024
M. Zurek
Yudong Chen
367
11
0
18 Mar 2024
Dealing with unbounded gradients in stochastic saddle-point optimization
Gergely Neu
Nneka Okolo
348
5
0
21 Feb 2024
Scalable and Independent Learning of Nash Equilibrium Policies in $n$-Player Stochastic Games with Unknown Independent Chains
Tiancheng Qin
S. Rasoul Etesami
300
2
0
04 Dec 2023
Span-Based Optimal Sample Complexity for Average Reward MDPs
M. Zurek
Yudong Chen
290
9
0
22 Nov 2023
Optimal Sample Complexity for Average Reward Markov Decision Processes
International Conference on Learning Representations (ICLR), 2023
Shengbo Wang
Jose H. Blanchet
Peter Glynn
303
15
0
13 Oct 2023
Sharper Model-free Reinforcement Learning for Average-reward Markov Decision Processes
Annual Conference Computational Learning Theory (COLT), 2023
Zihan Zhang
Qiaomin Xie
OffRL
198
25
0
28 Jun 2023
Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
Neural Information Processing Systems (NeurIPS), 2023
Dongsheng Ding
Chen-Yu Wei
Jianchao Tan
Alejandro Ribeiro
339
31
0
20 Jun 2023
A Central Limit Theorem for Algorithmic Estimator of Saddle Point
Abhishek Roy
Yian Ma
318
1
0
09 Jun 2023
First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
Neural Information Processing Systems (NeurIPS), 2023
Aleksandr Beznosikov
S. Samsonov
Marina Sheshukova
Alexander Gasnikov
A. Naumov
Eric Moulines
277
18
0
25 May 2023
Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities
Neural Information Processing Systems (NeurIPS), 2023
Aleksandr Beznosikov
Martin Takáč
Alexander Gasnikov
300
13
0
15 Feb 2023
Optimal Sample Complexity of Reinforcement Learning for Mixing Discounted Markov Decision Processes
Shengbo Wang
Jose H. Blanchet
Peter Glynn
292
7
0
15 Feb 2023
Reducing Blackwell and Average Optimality to Discounted MDPs via the Blackwell Discount Factor
Neural Information Processing Systems (NeurIPS), 2023
Julien Grand-Clément
Marko Petrik
243
22
0
31 Jan 2023
An Efficient Stochastic Algorithm for Decentralized Nonconvex-Strongly-Concave Minimax Optimization
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Le‐Yu Chen
Haishan Ye
Luo Luo
503
9
0
05 Dec 2022
Near Sample-Optimal Reduction-based Policy Learning for Average Reward MDP
Jinghan Wang
Meng-Xian Wang
Lin F. Yang
234
25
0
01 Dec 2022
Efficient Global Planning in Large MDPs via Stochastic Primal-Dual Optimization
International Conference on Algorithmic Learning Theory (ALT), 2022
Gergely Neu
Nneka Okolo
371
10
0
21 Oct 2022
Proximal Point Imitation Learning
Neural Information Processing Systems (NeurIPS), 2022
Luca Viano
Angeliki Kamoutsi
Gergely Neu
Igor Krawczuk
Volkan Cevher
457
20
0
22 Sep 2022
Smooth Monotone Stochastic Variational Inequalities and Saddle Point Problems: A Survey
European Mathematical Society Magazine (EMS Magazine), 2022
Aleksandr Beznosikov
Boris Polyak
Eduard A. Gorbunov
D. Kovalev
Alexander Gasnikov
315
33
0
29 Aug 2022
Algorithm for Constrained Markov Decision Process with Linear Convergence
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
E. Gladin
Maksim Lavrik-Karmazin
K. Zainullina
Varvara Rudenko
Alexander V. Gasnikov
Martin Takáč
250
9
0
03 Jun 2022
Stochastic first-order methods for average-reward Markov decision processes
Mathematics of Operations Research (MOR), 2022
Tianjiao Li
Feiyang Wu
Guanghui Lan
493
23
0
11 May 2022
Solving optimization problems with Blackwell approachability
Mathematics of Operations Research (MOR), 2022
Julien Grand-Clément
Christian Kroer
175
5
0
24 Feb 2022
Optimal Algorithms for Decentralized Stochastic Variational Inequalities
Neural Information Processing Systems (NeurIPS), 2022
D. Kovalev
Aleksandr Beznosikov
Abdurakhmon Sadiev
Michael Persiianov
Peter Richtárik
Alexander Gasnikov
281
38
0
06 Feb 2022
Learning Stationary Nash Equilibrium Policies in $n$-Player Stochastic Games with Independent Chains
SIAM Journal of Control and Optimization (SICON), 2022
S. Rasoul Etesami
393
9
0
28 Jan 2022
Optimal variance-reduced stochastic approximation in Banach spaces
Wenlong Mou
K. Khamaru
Martin J. Wainwright
Peter L. Bartlett
Sai Li
244
10
0
21 Jan 2022
Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations
International Conference on Machine Learning (ICML), 2021
Angeliki Kamoutsi
G. Banjac
John Lygeros
OffRL
191
9
0
28 Dec 2021
Quantum Algorithms for Reinforcement Learning with a Generative Model
Daochen Wang
Aarthi Sundaram
Robin Kothari
Ashish Kapoor
M. Rötteler
209
37
0
15 Dec 2021
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees
Aleksandr Beznosikov
Peter Richtárik
Michael Diskin
Max Ryabinin
Alexander Gasnikov
FedML
306
22
0
07 Oct 2021
Distributed Saddle-Point Problems Under Similarity
Aleksandr Beznosikov
G. Scutari
Alexander Rogozin
Alexander Gasnikov
425
14
0
22 Jul 2021
Bregman Gradient Policy Optimization
Feihu Huang
Shangqian Gao
Heng-Chiao Huang
433
19
0
23 Jun 2021
Decentralized Local Stochastic Extra-Gradient for Variational Inequalities
Aleksandr Beznosikov
Pavel Dvurechensky
Anastasia Koloskova
V. Samokhin
Sebastian U. Stich
Alexander Gasnikov
299
45
0
15 Jun 2021
Decentralized Personalized Federated Learning for Min-Max Problems
Ekaterina Borodich
Aleksandr Beznosikov
Abdurakhmon Sadiev
V. Sushko
Nikolay Savelyev
Martin Takáč
Alexander Gasnikov
FedML
371
4
0
14 Jun 2021
Towards Tight Bounds on the Sample Complexity of Average-reward MDPs
International Conference on Machine Learning (ICML), 2021
Yujia Jin
Aaron Sidford
112
41
0
13 Jun 2021
Distributionally Robust Optimization with Markovian Data
International Conference on Machine Learning (ICML), 2021
Mengmeng Li
Tobias Sutter
Daniel Kuhn
131
11
0
12 Jun 2021
Reward is enough for convex MDPs
Neural Information Processing Systems (NeurIPS), 2021
Tom Zahavy
Brendan O'Donoghue
Guillaume Desjardins
Satinder Singh
330
81
0
01 Jun 2021
Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving
Neural Information Processing Systems (NeurIPS), 2021
Julien Grand-Clément
Christian Kroer
316
5
0
27 May 2021
Near Optimal Policy Optimization via REPS
Neural Information Processing Systems (NeurIPS), 2021
Aldo Pacchiano
Jonathan Lee
Peter L. Bartlett
Ofir Nachum
191
3
0
17 Mar 2021
Primal-Dual Stochastic Mirror Descent for MDPs
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
D. Tiapkin
Alexander V. Gasnikov
314
7
0
27 Feb 2021
Decentralized Distributed Optimization for Saddle Point Problems
Optimization Methods and Software (OMS), 2021
Alexander Rogozin
Alexander Beznosikov
D. Dvinskikh
D. Kovalev
Pavel Dvurechensky
Alexander Gasnikov
434
25
0
15 Feb 2021
Long-Term Resource Allocation Fairness in Average Markov Decision Process (AMDP) Environment
Adaptive Agents and Multi-Agent Systems (AAMAS), 2021
Ganesh Ghalme
V. Nair
Vishakha Patil
Yilun Zhou
208
6
0
14 Feb 2021