Provable Self-Play Algorithms for Competitive Reinforcement Learning

International Conference on Machine Learning (ICML), 2020
10 February 2020
Yu Bai
Chi Jin
    SSL

Papers citing "Provable Self-Play Algorithms for Competitive Reinforcement Learning"

50 / 109 papers shown
Proximal Regret and Proximal Correlated Equilibria: A New Tractable Solution Concept for Online Learning and Games
Yang Cai
C. Daskalakis
Haipeng Luo
Chen-Yu Wei
Weiqiang Zheng
276
1
0
03 Nov 2025
Game-Theoretic Understandings of Multi-Agent Systems with Multiple Objectives
Yue Wang
195
0
0
27 Sep 2025
Language Self-Play For Data-Free Training
Jakub Grudzien Kuba
Mengting Gu
Qi Ma
Yuandong Tian
Vijai Mohan
Jason Chen
SyDa
489
21
0
09 Sep 2025
Sample-Efficient Distributionally Robust Multi-Agent Reinforcement Learning via Online Interaction
Zain Ulabedeen Farhat
Debamita Ghosh
George Atia
Yue Wang
211
2
0
04 Aug 2025
Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning
Till Freihaut
Luca Viano
Volkan Cevher
Matthieu Geist
Giorgia Ramponi
326
2
0
23 May 2025
The Lagrangian Method for Solving Constrained Markov Games
Soham Das
Santiago Paternain
Luiz F. O. Chamon
Ceyhun Eksin
355
0
0
13 Mar 2025
Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms
Neural Information Processing Systems (NeurIPS), 2024
Thanh Nguyen-Tang
Raman Arora
413
1
0
01 Nov 2024
Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models
Neural Information Processing Systems (NeurIPS), 2024
Chengshuai Shi
Kun Yang
Jing Yang
Cong Shen
261
0
0
13 Oct 2024
Efficient Reinforcement Learning in Probabilistic Reward Machines
AAAI Conference on Artificial Intelligence (AAAI), 2024
Xiaofeng Lin
Xuezhou Zhang
298
2
0
19 Aug 2024
Efficacy of Language Model Self-Play in Non-Zero-Sum Games
Austen Liao
Nicholas Tomlin
Dan Klein
375
10
0
27 Jun 2024
Competing for pixels: a self-play algorithm for weakly-supervised segmentation
Shaheer U. Saeed
Shiqi Huang
João Ramalhinho
Iani J. M. B. Gayo
Nina Montaña-Brown
...
Stephen P. Pereira
Brian R. Davidson
D. Barratt
Matthew J. Clarkson
Yipeng Hu
329
0
0
26 May 2024
Taming Equilibrium Bias in Risk-Sensitive Multi-Agent Reinforcement Learning
Yingjie Fei
Ruitu Xu
220
1
0
04 May 2024
Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning
Qiaosheng Zhang
Chenjia Bai
Shuyue Hu
Zhen Wang
Xuelong Li
325
2
0
30 Apr 2024
Differentially Private Reinforcement Learning with Self-Play
Dan Qiao
Yu Wang
281
0
0
11 Apr 2024
DP-Dueling: Learning from Preference Feedback without Compromising User Privacy
Aadirupa Saha
Hilal Asi
305
1
0
22 Mar 2024
Provably Efficient Partially Observable Risk-Sensitive Reinforcement Learning with Hindsight Observation
Tonghe Zhang
Yu Chen
Longbo Huang
269
0
0
28 Feb 2024
Refined Sample Complexity for Markov Games with Independent Linear Function Approximation
Annual Conference Computational Learning Theory (COLT), 2024
Yan Dai
Qiwen Cui
S. S. Du
397
2
0
11 Feb 2024
$\widetilde{O}(T^{-1})$ Convergence to (Coarse) Correlated Equilibria in Full-Information General-Sum Markov Games
Weichao Mao
Haoran Qiu
Chen Wang
Hubertus Franke
Zbigniew T. Kalbarczyk
Tamer Basar
263
0
0
02 Feb 2024
Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
Dan Qiao
Yu Wang
OffRL
303
4
0
02 Feb 2024
Sample-Efficient Multi-Agent RL: An Optimization Perspective
International Conference on Learning Representations (ICLR), 2023
Nuoya Xiong
Zhihan Liu
Zhaoran Wang
Zhuoran Yang
316
2
0
10 Oct 2023
VDFD: Multi-Agent Value Decomposition Framework with Disentangled World Model
Zhizun Wang
David Meger
DRL
348
4
0
08 Sep 2023
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games
International Conference on Machine Learning (ICML), 2023
Songtao Feng
Ming Yin
Yu Wang
J. Yang
Yitao Liang
167
1
0
17 Aug 2023
Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Guanlin Liu
Lifeng Lai
AAML
225
18
0
15 Jul 2023
Multi-Player Zero-Sum Markov Games with Networked Separable Interactions
Neural Information Processing Systems (NeurIPS), 2023
Chanwoo Park
Jianchao Tan
Asuman Ozdaglar
386
13
0
13 Jul 2023
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning
Conference on Learning for Dynamics & Control (L4DC), 2023
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
Mihailo R. Jovanović
OffRL
382
14
0
31 May 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
Neural Information Processing Systems (NeurIPS), 2023
Zhihan Liu
Miao Lu
Wei Xiong
Han Zhong
Haotian Hu
Shenao Zhang
Sirui Zheng
Zhuoran Yang
Zhaoran Wang
OffRL
380
27
0
29 May 2023
Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning
Neural Information Processing Systems (NeurIPS), 2023
Dingwen Kong
Lin F. Yang
270
16
0
18 Apr 2023
A New Policy Iteration Algorithm For Reinforcement Learning in Zero-Sum Markov Games
Anna Winnicki
R. Srikant
425
2
0
17 Mar 2023
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
Neural Information Processing Systems (NeurIPS), 2023
Yang Cai
Haipeng Luo
Chen-Yu Wei
Weiqiang Zheng
269
29
0
05 Mar 2023
A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games
Neural Information Processing Systems (NeurIPS), 2023
Zaiwei Chen
Jianchao Tan
Eric Mazumdar
Asuman Ozdaglar
Adam Wierman
375
16
0
03 Mar 2023
Can We Find Nash Equilibria at a Linear Rate in Markov Games?
International Conference on Learning Representations (ICLR), 2023
Zhuoqing Song
Jason D. Lee
Zhuoran Yang
400
10
0
03 Mar 2023
Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation
Annual Conference Computational Learning Theory (COLT), 2023
Yuanhao Wang
Qinghua Liu
Yunru Bai
Chi Jin
342
38
0
13 Feb 2023
Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Volodymyr Tkachuk
Seyed Alireza Bakhtiari
Johannes Kirschner
Matej Jusup
Ilija Bogunovic
Csaba Szepesvári
256
7
0
08 Feb 2023
Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation
Annual Conference Computational Learning Theory (COLT), 2023
Qiwen Cui
Jianchao Tan
S. Du
428
29
0
07 Feb 2023
Population-size-Aware Policy Optimization for Mean-Field Games
International Conference on Learning Representations (ICLR), 2023
Pengdeng Li
Xinrun Wang
Shuxin Li
Hau Chan
Bo An
236
3
0
07 Feb 2023
Robust Subtask Learning for Compositional Generalization
International Conference on Machine Learning (ICML), 2023
Kishor Jothimurugan
Steve Hsu
Osbert Bastani
Rajeev Alur
OffRL
261
7
0
06 Feb 2023
Offline Learning in Markov Games with General Function Approximation
International Conference on Machine Learning (ICML), 2023
Yuheng Zhang
Yunru Bai
Nan Jiang
OffRL
374
12
0
06 Feb 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Neural Information Processing Systems (NeurIPS), 2023
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
576
10
0
03 Feb 2023
Decentralized model-free reinforcement learning in stochastic games with average-reward objective
Adaptive Agents and Multi-Agent Systems (AAMAS), 2023
Romain Cravic
Nicolas Gast
B. Gaujal
195
2
0
13 Jan 2023
Provably Efficient Model-free RL in Leader-Follower MDP with Linear Function Approximation
Conference on Learning for Dynamics & Control (L4DC), 2022
A. Ghosh
250
2
0
28 Nov 2022
Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization
International Conference on Machine Learning (ICML), 2022
C. J. Li
An Yuan
Gauthier Gidel
Quanquan Gu
Michael I. Jordan
247
8
0
31 Oct 2022
On the convergence of policy gradient methods to Nash equilibria in general stochastic games
Neural Information Processing Systems (NeurIPS), 2022
Angeliki Giannou
Kyriakos Lotidis
P. Mertikopoulos
Emmanouil-Vasileios Vlatakis-Gkaragkounis
340
25
0
17 Oct 2022
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
International Conference on Machine Learning (ICML), 2022
Wei Xiong
Han Zhong
Chengshuai Shi
Cong Shen
Tong Zhang
204
23
0
04 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
International Conference on Learning Representations (ICLR), 2022
Shicong Cen
Yuejie Chi
S. Du
Lin Xiao
494
46
0
03 Oct 2022
$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games
Yuepeng Yang
Cong Ma
250
17
0
26 Sep 2022
Minimax-Optimal Multi-Agent RL in Markov Games With a Generative Model
Neural Information Processing Systems (NeurIPS), 2022
Gen Li
Yuejie Chi
Yuting Wei
Yuxin Chen
419
20
0
22 Aug 2022
Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium
C. J. Li
Dongruo Zhou
Quanquan Gu
Sai Li
173
2
0
10 Aug 2022
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
International Conference on Machine Learning (ICML), 2022
Delin Qu
Xiaohan Wei
Jieping Ye
Zhaoran Wang
Zhuoran Yang
OffRL
206
12
0
25 Jul 2022
A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games
Zihan Ding
DiJia Su
Qinghua Liu
Chi Jin
320
3
0
18 Jul 2022
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
Neural Information Processing Systems (NeurIPS), 2022
Jinglin Chen
Aditya Modi
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
390
29
0
21 Jun 2022