Title
DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning Youpeng Zhao Jian Zhao Xu Hu Wen-gang Zhou Houqiang Li 60 15 0 06 Apr 2022
PerfectDou: Dominating DouDizhu with Perfect Information Distillation Yang Guan Minghuan Liu Weijun Hong Weinan Zhang Fei Fang Guangjun Zeng Yue Lin 119 28 0 30 Mar 2022
On the link between conscious function and general intelligence in humans and machines Arthur Juliani Kai Arulkumaran Shuntaro Sasai Ryota Kanai 103 26 0 24 Mar 2022
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games Mathieu Laurière Sarah Perrin Sertan Girgin Paul Muller Ayush Jain ... Georgios Piliouras Julien Pérolat Romuald Élie Olivier Pietquin Matthieu Geist 99 44 0 22 Mar 2022
Self-Imitation Learning from Demonstrations Georgiy Pshikhachev Dmitry Ivanov Vladimir Egorov A. Shpilman 54 6 0 21 Mar 2022
Optimal Correlated Equilibria in General-Sum Extensive-Form Games: Fixed-Parameter Algorithms, Hardness, and Two-Sided Column-Generation B. Zhang Gabriele Farina A. Celli Tuomas Sandholm 81 22 0 14 Mar 2022
Generalized Bandit Regret Minimizer Framework in Imperfect Information Extensive-Form Game Lin Meng Yang Gao 129 1 0 11 Mar 2022
On-the-fly Strategy Adaptation for ad-hoc Agent Coordination Jaleh Zand Jack Parker-Holder Stephen J. Roberts 55 14 0 08 Mar 2022
Solving optimization problems with Blackwell approachability Julien Grand-Clément Christian Kroer 40 5 0 24 Feb 2022
Reinforcement Learning in Practice: Opportunities and Challenges Yuxi Li OffRL 64 9 0 23 Feb 2022
A History of Meta-gradient: Gradient Methods for Meta-learning R. Sutton 61 11 0 20 Feb 2022
Near-Optimal Learning of Extensive-Form Games with Imperfect Information Yu Bai Chi Jin Song Mei Tiancheng Yu 104 26 0 03 Feb 2022
Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the Gap Between Learning in Extensive-Form and Normal-Form Games Gabriele Farina Chung-Wei Lee Haipeng Luo Christian Kroer 46 32 0 01 Feb 2022
DecisionHoldem: Safe Depth-Limited Solving With Diverse Opponents for Imperfect-Information Games Qibin Zhou Dongdong Bai Junge Zhang Fuqing Duan Kaiqi Huang 18 2 0 27 Jan 2022
Public Information Representation for Adversarial Team Games Luca Carminati Federico Cacciamani Marco Ciccone N. Gatti 36 9 0 25 Jan 2022
NSGZero: Efficiently Learning Non-Exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search Wanqi Xue Bo An C. Yeo 8 4 0 17 Jan 2022
Revisiting Game Representations: The Hidden Costs of Efficiency in Sequential Decision-making Algorithms Vojtěch Kovařík David Milec Michal Sustr Dominik Seitz Viliam Lisý 28 0 0 20 Dec 2021
Modeling Strong and Human-Like Gameplay with KL-Regularized Search Athul Paul Jacob David J. Wu Gabriele Farina Adam Lerer Hengyuan Hu A. Bakhtin Jacob Andreas Noam Brown 60 54 0 14 Dec 2021
Student of Games: A unified learning algorithm for both perfect and imperfect information games Martin Schmid Matej Moravcík Neil Burch Rudolf Kadlec Josh Davidson ... Marc Lanctot G. Z. Holland Elnaz Davoodi Alden Christianson Michael Bowling 86 22 0 06 Dec 2021
On the complexity of Dark Chinese Chess Cong Wang Tongwei Lu 57 0 0 06 Dec 2021
Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text Christopher Clark Jordi Salvador Dustin Schwenk Derrick Bonafilia Mark Yatskar ... Aaron Sarnat Hannaneh Hajishirzi Aniruddha Kembhavi Oren Etzioni Ali Farhadi MLLM 52 5 0 01 Dec 2021
Beyond Time-Average Convergence: Near-Optimal Uncoupled Online Learning via Clairvoyant Multiplicative Weights Update Georgios Piliouras Ryann Sim Stratis Skoulakis 105 23 0 29 Nov 2021
AI in Human-computer Gaming: Techniques, Challenges and Opportunities Qiyue Yin Jun Yang Kaiqi Huang Meijing Zhao Wancheng Ni Bin Liang Yan Huang Shu Wu Liangsheng Wang 54 21 0 15 Nov 2021
Near-Optimal No-Regret Learning for Correlated Equilibria in Multi-Player General-Sum Games Ioannis Anagnostides C. Daskalakis Gabriele Farina Maxwell Fishelson Noah Golowich Tuomas Sandholm 153 56 0 11 Nov 2021
Towards convergence to Nash equilibria in two-team zero-sum games Fivos Kalogiannis Ioannis Panageas Emmanouil-Vasileios Vlatakis-Gkaragkounis 55 5 0 07 Nov 2021
Learning Diverse Policies in MOBA Games via Macro-Goals Yiming Gao Bei Shi Xueying Du Liang Wang Guangwei Chen ... Weixuan Wang Deheng Ye Qiang Fu Wei Yang Lanxiao Huang 76 11 0 27 Oct 2021
HSVI for zs-POSGs using Concavity, Convexity and Lipschitz Properties Aurélien Delage Olivier Buffet J. Dibangoye 92 3 0 25 Oct 2021
Independent Natural Policy Gradient Always Converges in Markov Potential Games Roy Fox Stephen Marcus McAleer W. Overman Ioannis Panageas 90 49 0 20 Oct 2021
Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent Weiming Liu Huacong Jiang Bin Li Houqiang Li 47 10 0 11 Oct 2021
Deep Synoptic Monte Carlo Planning in Reconnaissance Blind Chess Gregory Clark 84 9 0 05 Oct 2021
Scalable Online Planning via Reinforcement Learning Fine-Tuning Arnaud Fickinger Hengyuan Hu Brandon Amos Stuart J. Russell Noam Brown 97 21 0 30 Sep 2021
Generalization in Mean Field Games by Learning Master Policies Sarah Perrin Mathieu Laurière Julien Pérolat Romuald Élie Matthieu Geist Olivier Pietquin AI4CE 147 37 0 20 Sep 2021
Temporal Induced Self-Play for Stochastic Bayesian Games Weizhe Chen Zihan Zhou Yi Wu Fei Fang 23 3 0 21 Aug 2021
How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review Florian Tambon Gabriel Laberge Le An Amin Nikanjam Paulina Stevia Nouwou Mindom Y. Pequignot Foutse Khomh G. Antoniol E. Merlo François Laviolette 104 70 0 26 Jul 2021
Going Beyond Linear RL: Sample Efficient Neural Function Approximation Baihe Huang Kaixuan Huang Sham Kakade Jason D. Lee Qi Lei Runzhe Wang Jiaqi Yang 99 8 0 14 Jul 2021
Deep Multiagent Reinforcement Learning: Challenges and Directions Annie Wong Thomas Bäck Anna V. Kononova Aske Plaat AI4CE 116 97 0 29 Jun 2021
Evolutionary Dynamics and $Φ$-Regret Minimization in Games Georgios Piliouras Mark Rowland Shayegan Omidshafiei Romuald Élie Daniel Hennes Jerome T. Connor K. Tuyls 56 2 0 28 Jun 2021
Last-iterate Convergence in Extensive-Form Games Chung-Wei Lee Christian Kroer Haipeng Luo 190 40 0 27 Jun 2021
Post-Selections in AI and How to Avoid Them J. Weng 44 1 0 19 Jun 2021
MADE: Exploration via Maximizing Deviation from Explored Regions Tianjun Zhang Paria Rashidinejad Jiantao Jiao Yuandong Tian Joseph E. Gonzalez Stuart J. Russell OffRL 96 43 0 18 Jun 2021
Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings Hengyuan Hu Adam Lerer Noam Brown Jakob N. Foerster 113 20 0 16 Jun 2021
DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning Daochen Zha Jingru Xie Wenye Ma Sheng Zhang Xiangru Lian Helen Zhou Ji Liu 71 117 0 11 Jun 2021
Vector Quantized Models for Planning Sherjil Ozair Yazhe Li Ali Razavi Ioannis Antonoglou Aaron van den Oord Oriol Vinyals OffRL 92 51 0 08 Jun 2021
Dynamic Sparse Training for Deep Reinforcement Learning Ghada Sokar Elena Mocanu Decebal Constantin Mocanu Mykola Pechenizkiy Peter Stone 106 59 0 08 Jun 2021
Improving Social Welfare While Preserving Autonomy via a Pareto Mediator Stephen Marcus McAleer John Lanier Michael Dennis Pierre Baldi Roy Fox 45 4 0 07 Jun 2021
Bottom-up and top-down approaches for the design of neuromorphic processing systems: Tradeoffs and synergies between natural and artificial intelligence Charlotte Frenkel D. Bol Giacomo Indiveri 85 36 0 02 Jun 2021
Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving Julien Grand-Clément Christian Kroer 35 5 0 27 May 2021
Better Regularization for Sequential Decision Spaces: Fast Convergence Rates for Nash, Correlated, and Team Equilibria Gabriele Farina Christian Kroer Tuomas Sandholm 58 26 0 27 May 2021
D2CFR: Minimize Counterfactual Regret with Deep Dueling Neural Network Huale Li Xuan Wang Zengyue Guo Jia-jia Zhang Shuhan Qi 34 1 0 26 May 2021
From Motor Control to Team Play in Simulated Humanoid Football Siqi Liu Guy Lever Zhe Wang J. Merel S. M. Ali Eslami ... Tuomas Haarnoja Brendan D. Tracey K. Tuyls T. Graepel N. Heess 117 134 0 25 May 2021