A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

International Conference on Machine Learning (ICML), 2020
4 October 2020
Qinghua Liu
Tiancheng Yu
Yu Bai
Chi Jin

Papers citing "A Sharp Analysis of Model-based Reinforcement Learning with Self-Play"

50 / 96 papers shown
Game-Theoretic Understandings of Multi-Agent Systems with Multiple Objectives
Yue Wang
104
0
0
27 Sep 2025
Multi-Agent Reinforcement Learning in Intelligent Transportation Systems: A Comprehensive Survey
RexCharles Donatus
Kumater Ter
Ore-Ofe Ajayi
Daniel Udekwe
120
1
0
27 Aug 2025
Online Robust Multi-Agent Reinforcement Learning under Model Uncertainties
Zain Ulabedeen Farhat
Debamita Ghosh
George Atia
Yue Wang
130
1
0
04 Aug 2025
Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning
Till Freihaut
Luca Viano
Volkan Cevher
Matthieu Geist
Giorgia Ramponi
200
1
0
23 May 2025
Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
Yuheng Zhang
Dian Yu
Tao Ge
Linfeng Song
Zhichen Zeng
Haitao Mi
Nan Jiang
Dong Yu
284
10
0
24 Feb 2025
Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games
Tong Yang
Bo Dai
Lin Xiao
Yuejie Chi
OffRL
399
3
0
13 Feb 2025
Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms
Neural Information Processing Systems (NeurIPS), 2024
Thanh Nguyen-Tang
Raman Arora
306
1
0
01 Nov 2024
Applying Neural Monte Carlo Tree Search to Unsignalized Multi-intersection Scheduling for Autonomous Vehicles
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Yucheng Shi
Wenlong Wang
Xiaowen Tao
Ivana Dusparic
Vinny Cahill
133
1
0
24 Oct 2024
Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models
Neural Information Processing Systems (NeurIPS), 2024
Chengshuai Shi
Kun Yang
Jing Yang
Cong Shen
156
0
0
13 Oct 2024
Efficient Reinforcement Learning in Probabilistic Reward Machines
AAAI Conference on Artificial Intelligence (AAAI), 2024
Xiaofeng Lin
Xuezhou Zhang
220
2
0
19 Aug 2024
Decentralized Online Learning in General-Sum Stackelberg Games
Conference on Uncertainty in Artificial Intelligence (UAI), 2024
Yaolong Yu
Haipeng Chen
223
0
0
06 May 2024
Taming Equilibrium Bias in Risk-Sensitive Multi-Agent Reinforcement Learning
Yingjie Fei
Ruitu Xu
157
0
0
04 May 2024
MF-OML: Online Mean-Field Reinforcement Learning with Occupation Measures for Large Population Games
Anran Hu
Junzi Zhang
242
7
0
01 May 2024
Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning
Qiaosheng Zhang
Chenjia Bai
Shuyue Hu
Zhen Wang
Xuelong Li
243
2
0
30 Apr 2024
Differentially Private Reinforcement Learning with Self-Play
Dan Qiao
Yu Wang
214
0
0
11 Apr 2024
DP-Dueling: Learning from Preference Feedback without Compromising User Privacy
Aadirupa Saha
Hilal Asi
212
1
0
22 Mar 2024
RL in Markov Games with Independent Function Approximation: Improved Sample Complexity Bound under the Local Access Model
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Junyi Fan
Yuxuan Han
Jialin Zeng
Jian-Feng Cai
Yang Wang
Yang Xiang
Jiheng Zhang
352
1
0
18 Mar 2024
Refined Sample Complexity for Markov Games with Independent Linear Function Approximation
Annual Conference Computational Learning Theory (COLT), 2024
Yan Dai
Qiwen Cui
S. S. Du
282
1
0
11 Feb 2024
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
International Conference on Machine Learning (ICML), 2024
Han Shen
Zhuoran Yang
Tianyi Chen
OffRL
327
28
0
10 Feb 2024
$\widetilde{O}(T^{-1})$ Convergence to (Coarse) Correlated Equilibria in Full-Information General-Sum Markov Games
Weichao Mao
Haoran Qiu
Chen Wang
Hubertus Franke
Zbigniew T. Kalbarczyk
Tamer Basar
188
0
0
02 Feb 2024
Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
Dan Qiao
Yu Wang
OffRL
226
3
0
02 Feb 2024
Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability
Neural Information Processing Systems (NeurIPS), 2023
Revan MacQueen
James R. Wright
220
2
0
17 Oct 2023
Sample-Efficient Multi-Agent RL: An Optimization Perspective
International Conference on Learning Representations (ICLR), 2023
Nuoya Xiong
Zhihan Liu
Zhaoran Wang
Zhuoran Yang
247
1
0
10 Oct 2023
Local and adaptive mirror descents in extensive-form games
Neural Information Processing Systems (NeurIPS), 2023
Côme Fiegel
Pierre Ménard
Tadashi Kozuno
Rémi Munos
Vianney Perchet
Michal Valko
227
3
0
01 Sep 2023
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games
International Conference on Machine Learning (ICML), 2023
Songtao Feng
Ming Yin
Yu Wang
J. Yang
Yitao Liang
122
2
0
17 Aug 2023
Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Guanlin Liu
Lifeng Lai
AAML
166
16
0
15 Jul 2023
Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty
Guanlin Liu
Zhihan Zhou
Han Liu
Lifeng Lai
255
4
0
15 Jul 2023
Multi-Player Zero-Sum Markov Games with Networked Separable Interactions
Neural Information Processing Systems (NeurIPS), 2023
Chanwoo Park
Jianchao Tan
Asuman Ozdaglar
313
9
0
13 Jul 2023
A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Haozhe Jiang
Qiwen Cui
Zhihan Xiong
Maryam Fazel
S. Du
195
6
0
12 Jun 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
Neural Information Processing Systems (NeurIPS), 2023
Zhihan Liu
Miao Lu
Wei Xiong
Han Zhong
Haotian Hu
Shenao Zhang
Sirui Zheng
Zhuoran Yang
Zhaoran Wang
OffRL
328
24
0
29 May 2023
Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning
Neural Information Processing Systems (NeurIPS), 2023
Dingwen Kong
Lin F. Yang
225
16
0
18 Apr 2023
A New Policy Iteration Algorithm For Reinforcement Learning in Zero-Sum Markov Games
Anna Winnicki
R. Srikant
318
2
0
17 Mar 2023
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
Neural Information Processing Systems (NeurIPS), 2023
Yang Cai
Haipeng Luo
Chen-Yu Wei
Weiqiang Zheng
214
24
0
05 Mar 2023
A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games
Neural Information Processing Systems (NeurIPS), 2023
Zaiwei Chen
Jianchao Tan
Eric Mazumdar
Asuman Ozdaglar
Adam Wierman
313
12
0
03 Mar 2023
Can We Find Nash Equilibria at a Linear Rate in Markov Games?
International Conference on Learning Representations (ICLR), 2023
Zhuoqing Song
Jason D. Lee
Zhuoran Yang
270
10
0
03 Mar 2023
Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Pedro Cisneros-Velarde
Oluwasanmi Koyejo
226
1
0
01 Mar 2023
Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation
Annual Conference Computational Learning Theory (COLT), 2023
Yuanhao Wang
Qinghua Liu
Yunru Bai
Chi Jin
266
34
0
13 Feb 2023
Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Volodymyr Tkachuk
Seyed Alireza Bakhtiari
Johannes Kirschner
Matej Jusup
Ilija Bogunovic
Csaba Szepesvári
180
5
0
08 Feb 2023
Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation
Annual Conference Computational Learning Theory (COLT), 2023
Qiwen Cui
Jianchao Tan
S. Du
315
28
0
07 Feb 2023
Population-size-Aware Policy Optimization for Mean-Field Games
International Conference on Learning Representations (ICLR), 2023
Pengdeng Li
Xinrun Wang
Shuxin Li
Hau Chan
Bo An
193
2
0
07 Feb 2023
Offline Learning in Markov Games with General Function Approximation
International Conference on Machine Learning (ICML), 2023
Yuheng Zhang
Yunru Bai
Nan Jiang
OffRL
269
10
0
06 Feb 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Neural Information Processing Systems (NeurIPS), 2023
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
430
9
0
03 Feb 2023
Learning from Multiple Independent Advisors in Multi-agent Reinforcement Learning
Adaptive Agents and Multi-Agent Systems (AAMAS), 2023
Sriram Ganapathi Subramanian
Matthew E. Taylor
Kate Larson
Mark Crowley
135
1
0
26 Jan 2023
Adapting to game trees in zero-sum imperfect information games
International Conference on Machine Learning (ICML), 2022
Côme Fiegel
Pierre Ménard
Tadashi Kozuno
Rémi Munos
Vianney Perchet
Michal Valko
455
13
0
23 Dec 2022
Multi-Agent Reinforcement Learning with Reward Delays
Conference on Learning for Dynamics & Control (L4DC), 2022
Yuyang Zhang
Runyu Zhang
Yu Gu
Na Li
199
12
0
02 Dec 2022
Provably Efficient Model-free RL in Leader-Follower MDP with Linear Function Approximation
Conference on Learning for Dynamics & Control (L4DC), 2022
A. Ghosh
166
2
0
28 Nov 2022
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
International Conference on Machine Learning (ICML), 2022
Wei Xiong
Han Zhong
Chengshuai Shi
Cong Shen
Tong Zhang
164
21
0
04 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
International Conference on Learning Representations (ICLR), 2022
Shicong Cen
Yuejie Chi
S. Du
Lin Xiao
400
43
0
03 Oct 2022
$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games
Yuepeng Yang
Cong Ma
185
17
0
26 Sep 2022
Minimax-Optimal Multi-Agent RL in Markov Games With a Generative Model
Neural Information Processing Systems (NeurIPS), 2022
Gen Li
Yuejie Chi
Yuting Wei
Yuxin Chen
327
20
0
22 Aug 2022