ResearchTrend.AI
Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall

11 June 2021
Tadashi Kozuno
Pierre Ménard
Rémi Munos
Michal Valko

Papers citing "Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall"

10 / 10 papers shown
Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
Yuheng Zhang
Dian Yu
Tao Ge
Linfeng Song
Zhichen Zeng
Haitao Mi
Nan Jiang
Dong Yu
58
1
0
24 Feb 2025
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Yuheng Zhang
Dian Yu
Baolin Peng
Linfeng Song
Ye Tian
Mingyue Huo
Nan Jiang
Haitao Mi
Dong Yu
35
14
0
30 Jun 2024
A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games
Zihan Ding
DiJia Su
Qinghua Liu
Chi Jin
30
3
0
18 Jul 2022
Approximate Nash Equilibrium Learning for n-Player Markov Games in Dynamic Pricing
Larkin Liu
25
1
0
13 Jul 2022
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games
Qinghua Liu
Csaba Szepesvári
Chi Jin
26
20
0
02 Jun 2022
Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent
Yu Bai
Chi Jin
Song Mei
Ziang Song
Tiancheng Yu
OffRL
52
18
0
30 May 2022
Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits
Qinghua Liu
Yuanhao Wang
Chi Jin
AAML
21
15
0
14 Mar 2022
Generalized Bandit Regret Minimizer Framework in Imperfect Information Extensive-Form Game
Lin Meng
Yang Gao
37
1
0
11 Mar 2022
Near-Optimal Learning of Extensive-Form Games with Imperfect Information
Yu Bai
Chi Jin
Song Mei
Tiancheng Yu
21
26
0
03 Feb 2022
Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent
Gabriele Farina
Christian Kroer
Tuomas Sandholm
46
72
0
28 Jul 2020