arXiv: 2206.15378
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

30 June 2022
Julien Perolat
Bart De Vylder
Daniel Hennes
Eugene Tarassov
Florian Strub
V. D. Boer
Paul Muller
Jerome T. Connor
Neil Burch
Thomas W. Anthony
Stephen Marcus McAleer
Romuald Elie
Sarah H. Cen
Zhe Wang
A. Gruslys
Aleksandra Malysheva
Mina Khan
Sherjil Ozair
Finbarr Timbers
Tobias Pohlen
Tom Eccles
Mark Rowland
Marc Lanctot
Jean-Baptiste Lespiau
Bilal Piot
Shayegan Omidshafiei
Edward Lockhart
Laurent Sifre
Nathalie Beauguerlange
Rémi Munos
David Silver
Satinder Singh
Demis Hassabis
K. Tuyls

Papers citing "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning"

50 / 85 papers shown
Title
Human-Level Competitive Pokémon via Scalable Offline Reinforcement Learning with Transformers
Jake Grigsby
Yuqi Xie
Justin Sasek
Steven Zheng
Yuke Zhu
OffRL
26
0
0
06 Apr 2025
Asynchronous Predictive Counterfactual Regret Minimization$^+$ Algorithm in Solving Extensive-Form Games
Linjian Meng
Youzhi Zhang
Zhenxing Ge
Tianpei Yang
Yang Gao
38
0
0
17 Mar 2025
On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games
Yang Cai
Gabriele Farina
Julien Grand-Clément
Christian Kroer
Chung-Wei Lee
Haipeng Luo
Weiqiang Zheng
39
0
0
04 Mar 2025
On the Interplay between Social Welfare and Tractability of Equilibria
Ioannis Anagnostides
T. Sandholm
50
2
0
10 Jan 2025
Barriers to Welfare Maximization with No-Regret Learning
Ioannis Anagnostides
Alkis Kalavasis
T. Sandholm
38
1
0
04 Nov 2024
COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences
Y. Liu
Argyris Oikonomou
Weiqiang Zheng
Yang Cai
Arman Cohan
40
1
0
30 Oct 2024
Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment
Mingzhi Wang
Chengdong Ma
Qizhi Chen
Linjian Meng
Yang Han
Jiancong Xiao
Zhaowei Zhang
Jing Huo
Weijie Su
Yaodong Yang
32
4
0
22 Oct 2024
Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and Reliability
Weitong Zhang
Chengqi Zang
Bernhard Kainz
31
0
0
01 Oct 2024
Identifying and Clustering Counter Relationships of Team Compositions in PvP Games for Efficient Balance Analysis
Chiu-Chou Lin
Yu-Wei Shih
Kuei-Ting Kuo
Yu-Cheng Chen
Chien-Hua Chen
Wei-Chen Chiu
I-Chen Wu
27
0
0
30 Aug 2024
In-Context Exploiter for Extensive-Form Games
Shuxin Li
Chang Yang
Youzhi Zhang
Pengdeng Li
Xinrun Wang
Xiao Huang
Hau Chan
Bo An
29
0
0
10 Aug 2024
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
Saman Kazemkhani
Aarav Pandya
Daphne Cornelisse
Brennan Shacklett
Eugene Vinitsky
44
8
0
02 Aug 2024
A Survey on Self-play Methods in Reinforcement Learning
Chao Yu
Zelai Xu
Chengdong Ma
Chao Yu
Weijuan Tu
...
Deheng Ye
Wenbo Ding
Yaodong Yang
Yu Wang
SyDa
SSL
OnRL
51
8
0
02 Aug 2024
Overcoming Binary Adversarial Optimisation with Competitive Coevolution
Per Kristian Lehre
Shishen Lin
AAML
27
1
0
25 Jul 2024
Neural Network-based Information Set Weighting for Playing Reconnaissance Blind Chess
Timo Bertram
Johannes Fürnkranz
Martin Müller
35
1
0
08 Jul 2024
PUZZLES: A Benchmark for Neural Algorithmic Reasoning
Benjamin Estermann
Luca A. Lanzendörfer
Yannick Niedermayr
Roger Wattenhofer
50
2
0
29 Jun 2024
Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms
Yang Cai
Gabriele Farina
Julien Grand-Clément
Christian Kroer
Chung-Wei Lee
Haipeng Luo
Weiqiang Zheng
55
6
0
15 Jun 2024
Trainability issues in quantum policy gradients
André Sequeira
Luis Paulo Santos
Luis Soares Barbosa
46
1
0
13 Jun 2024
Optimization of geological carbon storage operations with multimodal latent dynamic model and deep reinforcement learning
Zhongzheng Wang
Yuntian Chen
Guodong Chen
Dongxiao Zhang
AI4CE
31
0
0
07 Jun 2024
Open-Endedness is Essential for Artificial Superhuman Intelligence
Edward Hughes
Michael Dennis
Jack Parker-Holder
Feryal M. P. Behbahani
Aditi Mavalankar
Yuge Shi
Tom Schaul
Tim Rocktäschel
LRM
40
21
0
06 Jun 2024
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu
Yang Li
Yixing Lan
Hao Gao
Wei Pan
Xin Xu
OffRL
36
5
0
30 May 2024
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
64
3
0
29 May 2024
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
OnRL
41
2
0
28 May 2024
Mixture of Public and Private Distributions in Imperfect Information Games
Jérôme Arjonilla
Abdallah Saffidine
Tristan Cazenave
24
1
0
23 May 2024
Configurable Mirror Descent: Towards a Unification of Decision Making
Pengdeng Li
Shuxin Li
Chang Yang
Xinrun Wang
Shuyue Hu
Xiao Huang
Hau Chan
Bo An
36
1
0
20 May 2024
A Meta-Game Evaluation Framework for Deep Multiagent Reinforcement Learning
Zun Li
Michael P. Wellman
34
1
0
30 Apr 2024
State-Constrained Zero-Sum Differential Games with One-Sided Information
Mukesh Ghimire
Lei Zhang
Zhenni Xu
Yi Ren
44
2
0
05 Mar 2024
Policy Space Response Oracles: A Survey
Ariyan Bighashdel
Yongzhao Wang
Stephen Marcus McAleer
Rahul Savani
F. Oliehoek
33
6
0
04 Mar 2024
Understanding Iterative Combinatorial Auction Designs via Multi-Agent Reinforcement Learning
G. d'Eon
N. Newman
Kevin Leyton-Brown
32
0
0
29 Feb 2024
Offline Fictitious Self-Play for Competitive Games
Jingxiao Chen
Weiji Xie
Weinan Zhang
Yong Zu
Ying Wen
OffRL
46
0
0
29 Feb 2024
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Jie Cheng
Gang Xiong
Xingyuan Dai
Q. Miao
Yisheng Lv
Fei-Yue Wang
33
15
0
27 Feb 2024
Experiments with Encoding Structured Data for Neural Networks
Sujay Nagesh Koujalgi
Jonathan Dodge
19
0
0
15 Feb 2024
SwarmBrain: Embodied agent for real-time strategy game StarCraft II via large language models
Xiao Shao
Weifu Jiang
Fei Zuo
Mengqing Liu
LLMAG
33
7
0
31 Jan 2024
Symbolic Equation Solving via Reinforcement Learning
Lennart Dabelow
Masahito Ueda
53
2
0
24 Jan 2024
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents
Siyuan Qi
Shuo Chen
Yexin Li
Xiangyu Kong
Junqi Wang
...
Zhaowei Zhang
Nian Liu
Wei Wang
Yaodong Yang
Song-Chun Zhu
AI4CE
LRM
27
18
0
19 Jan 2024
Exploiting hidden structures in non-convex games for convergence to Nash equilibrium
Iosif Sakos
Emmanouil-Vasileios Vlatakis-Gkaragkounis
P. Mertikopoulos
Georgios Piliouras
19
5
0
27 Dec 2023
Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence Beyond the Minty Property
Ioannis Anagnostides
Ioannis Panageas
Gabriele Farina
T. Sandholm
33
3
0
19 Dec 2023
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRL
OOD
78
5
0
13 Dec 2023
Computing Perfect Bayesian Equilibria in Sequential Auctions with Verification
Vinzenz Thoma
Vitor Bosshard
Sven Seuken
31
1
0
07 Dec 2023
Honesty Is the Best Policy: Defining and Mitigating AI Deception
Francis Rhys Ward
Francesco Belardinelli
Francesca Toni
Tom Everitt
110
27
0
03 Dec 2023
Nash Learning from Human Feedback
Rémi Munos
Michal Valko
Daniele Calandriello
M. G. Azar
Mark Rowland
...
Nikola Momchev
Olivier Bachem
D. Mankowitz
Doina Precup
Bilal Piot
42
125
0
01 Dec 2023
Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability
Revan MacQueen
James R. Wright
26
2
0
17 Oct 2023
The Consensus Game: Language Model Generation via Equilibrium Search
Athul Paul Jacob
Songlin Yang
Gabriele Farina
Jacob Andreas
39
19
0
13 Oct 2023
Deep Reinforcement Learning for Autonomous Cyber Operations: A Survey
Gregory Palmer
Chris Parry
Daniel J.B. Harrold
Chris Willis
AI4CE
21
1
0
11 Oct 2023
Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning
Jiayu Chen
Zelai Xu
Yunfei Li
Chao Yu
Jiaming Song
Huazhong Yang
Fei Fang
Yu Wang
Yi Wu
29
4
0
07 Oct 2023
Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation
Shenzhi Wang
Chang Liu
Zilong Zheng
Siyuan Qi
Shuo Chen
Qisen Yang
Andrew Zhao
Chaofei Wang
Shiji Song
Gao Huang
LLMAG
37
63
0
02 Oct 2023
Smooth Nash Equilibria: Algorithms and Complexity
C. Daskalakis
Noah Golowich
Nika Haghtalab
Abhishek Shetty
25
4
0
21 Sep 2023
Efficient Last-iterate Convergence Algorithms in Solving Games
Lin Meng
Zhenxing Ge
Wenbin Li
Tianpei Yang
Bo An
Yang Gao
15
0
0
22 Aug 2023
JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games
Yang Li
Kun Xiong
Yingping Zhang
Jiangcheng Zhu
Stephen Marcus McAleer
Wei Pan
Jun Wang
Zonghong Dai
Yaodong Yang
39
2
0
09 Aug 2023
Stability of Multi-Agent Learning: Convergence in Network Games with Many Players
A. Hussain
D. Leonte
Francesco Belardinelli
Georgios Piliouras
MLT
21
0
0
26 Jul 2023
Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations
Yongyuan Liang
Yanchao Sun
Ruijie Zheng
Xiangyu Liu
Benjamin Eysenbach
T. Sandholm
Furong Huang
Stephen Marcus McAleer
OOD
43
0
0
22 Jul 2023