Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback

5 March 2023
Yang Cai
Haipeng Luo
Chen-Yu Wei
Weiqiang Zheng
arXiv:2303.02738

Papers citing "Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback"

Decentralized Online Learning in General-Sum Stackelberg Games
Yaolong Yu
Haipeng Chen
22
0
0
06 May 2024
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Gokul Swamy
Christoph Dann
Rahul Kidambi
Zhiwei Steven Wu
Alekh Agarwal
OffRL
28
94
0
08 Jan 2024
Multi-Player Zero-Sum Markov Games with Networked Separable Interactions
Chanwoo Park
K. Zhang
Asuman Ozdaglar
21
8
0
13 Jul 2023
Doubly Optimal No-Regret Learning in Monotone Games
Yang Cai
Weiqiang Zheng
33
11
0
30 Jan 2023
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
Wei Xiong
Han Zhong
Chengshuai Shi
Cong Shen
Tong Zhang
53
19
0
04 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Shicong Cen
Yuejie Chi
S. Du
Lin Xiao
41
35
0
03 Oct 2022
$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games
Yuepeng Yang
Cong Ma
33
14
0
26 Sep 2022
Uncoupled Bandit Learning towards Rationalizability: Benchmarks, Barriers, and Algorithms
Jibang Wu
Haifeng Xu
Fan Yao
22
1
0
10 Nov 2021
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
Yulai Zhao
Yuandong Tian
Jason D. Lee
S. Du
OffRL
39
18
0
17 Feb 2021
Independent Policy Gradient Methods for Competitive Reinforcement Learning
C. Daskalakis
Dylan J. Foster
Noah Golowich
57
158
0
11 Jan 2021