1712.01815
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
5 December 2017
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
A. Guez
Marc Lanctot
Laurent Sifre
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis
Papers citing
"Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm"
50 / 839 papers shown
Do Zombies Understand? A Choose-Your-Own-Adventure Exploration of Machine Cognition
Ariel Goldstein
Gabriel Stanovsky
205
2
0
01 Mar 2024
Understanding Iterative Combinatorial Auction Designs via Multi-Agent Reinforcement Learning
G. d'Eon
N. Newman
Kevin Leyton-Brown
176
1
0
29 Feb 2024
Offline Fictitious Self-Play for Competitive Games
Jingxiao Chen
Weiji Xie
W. Zhang
Yong Zu
Ying Wen
OffRL
223
0
0
29 Feb 2024
Impact of Computation in Integral Reinforcement Learning for Continuous-Time Control
Wenhan Cao
Wei Pan
214
1
0
27 Feb 2024
Rigor with Machine Learning from Field Theory to the Poincaré Conjecture
Sergei Gukov
James Halverson
Fabian Ruehle
AI4CE
140
22
0
20 Feb 2024
Puzzle Solving using Reasoning of Large Language Models: A Survey
Panagiotis Giadikiaroglou
Maria Lymperaiou
Giorgos Filandrianos
Giorgos Stamou
ELM
ReLM
LRM
378
52
0
17 Feb 2024
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
Huizhuo Yuan
Zixiang Chen
Kaixuan Ji
Quanquan Gu
233
63
0
15 Feb 2024
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Alex Havrilla
Sharath Raparthy
Christoforus Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Roberta Raileanu
ReLM
LRM
280
96
0
13 Feb 2024
Large Language Models as Agents in Two-Player Games
Yang Liu
Hang Li
LLMAG
183
7
0
12 Feb 2024
Scaling Intelligent Agents in Combat Simulations for Wargaming
Scotty Black
Christian J. Darken
46
3
0
08 Feb 2024
Scaling Artificial Intelligence for Digital Wargaming in Support of Decision-Making
Scotty Black
Christian J. Darken
122
2
0
08 Feb 2024
Grandmaster-Level Chess Without Search
Anian Ruoss
Grégoire Delétang
Sourabh Medapati
Jordi Grau-Moya
Wenliang Kevin Li
Elliot Catt
John Reid
Tim Genewein
LRM
231
7
0
07 Feb 2024
A Multi-step Loss Function for Robust Learning of the Dynamics in Model-based Reinforcement Learning
Khyati Khandelwal
Albert Thomas
Giuseppe Paolo
Maurizio Filippone
Jun Yao
NoLa
164
3
0
05 Feb 2024
Deep autoregressive density nets vs neural ensembles for model-based offline reinforcement learning
Khyati Khandelwal
Albert Thomas
Jun Yao
OffRL
214
2
0
05 Feb 2024
The RL/LLM Taxonomy Tree: Reviewing Synergies Between Reinforcement Learning and Large Language Models
M. Pternea
Prerna Singh
Abir Chakraborty
Y. Oruganti
M. Milletarí
Sayli Bapat
Kebei Jiang
OffRL
252
24
0
02 Feb 2024
SymbolicAI: A framework for logic-based approaches combining generative models and solvers
Marius-Constantin Dinu
Claudiu Leoveanu-Condrei
Markus Holzleitner
Werner Zellinger
Sepp Hochreiter
320
18
0
01 Feb 2024
Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis
Zhengyang Lu
Stefan Siemer
Piyush Jha
Joel D. Day
Florin Manea
Vijay Ganesh
100
13
0
30 Jan 2024
CNN architecture extraction on edge GPU
Péter Horváth
Lukasz Chmielewski
Léo Weissbart
L. Batina
Y. Yarom
MLAU
156
8
0
24 Jan 2024
AlphaMapleSAT: An MCTS-based Cube-and-Conquer SAT Solver for Hard Combinatorial Problems
Piyush Jha
Zhengyu Li
Zhengyang Lu
Curtis Bright
Vijay Ganesh
141
5
0
24 Jan 2024
Deep Learning Based Simulators for the Phosphorus Removal Process Control in Wastewater Treatment via Deep Reinforcement Learning Algorithms
Engineering applications of artificial intelligence (EAAI), 2024
Esmaeel Mohammadi
Mikkel Stokholm-Bjerregaard
A. A. Hansen
Per Halkjaer Nielsen
D. O. Arroyo
Petar Durdevic
AI4CE
131
18
0
23 Jan 2024
Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization
International Conference on Learning Representations (ICLR), 2024
A. B. Chowdhury
Marco Romanelli
Benjamin Tan
Ramesh Karri
Siddharth Garg
195
14
0
22 Jan 2024
VQC-Based Reinforcement Learning with Data Re-uploading: Performance and Trainability
Quantum Machine Intelligence (QMI), 2024
Rodrigo Coelho
André Sequeira
Luis Paulo Santos
224
17
0
21 Jan 2024
Learning a Prior for Monte Carlo Search by Replaying Solutions to Combinatorial Problems
Tristan Cazenave
88
2
0
19 Jan 2024
Generalized Nested Rollout Policy Adaptation with Limited Repetitions
Tristan Cazenave
105
4
0
18 Jan 2024
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
International Conference on Machine Learning (ICML), 2024
Zixiang Chen
Yihe Deng
Huizhuo Yuan
Kaixuan Ji
Quanquan Gu
SyDa
563
448
0
02 Jan 2024
Towards Cognitive AI Systems: a Survey and Prospective on Neuro-Symbolic AI
Zishen Wan
Che-Kai Liu
Hanchen Yang
Chaojian Li
Haoran You
Yonggan Fu
Cheng Wan
Tushar Krishna
Yingyan Lin
A. Raychowdhury
AAML
296
35
0
02 Jan 2024
HiER: Highlight Experience Replay for Boosting Off-Policy Reinforcement Learning Agents
IEEE Access (IEEE Access), 2023
Dániel Horváth
Jesús Bujalance Martín
Ferenc Gábor Erdős
Z. Istenes
Fabien Moutarde
OffRL
219
3
0
14 Dec 2023
Assessing SATNet's Ability to Solve the Symbol Grounding Problem
Neural Information Processing Systems (NeurIPS), 2023
Oscar Chang
Lampros Flokas
Hod Lipson
Michael Spranger
NAI
187
24
0
13 Dec 2023
BarraCUDA: GPUs do Leak DNN Weights
Péter Horváth
Lukasz Chmielewski
Léo Weissbart
L. Batina
Y. Yarom
292
0
0
12 Dec 2023
DiSK: A Diffusion Model for Structured Knowledge
O. Kitouni
Niklas Nolte
James Hensman
Bhaskar Mitra
DiffM
181
11
0
08 Dec 2023
FoMo Rewards: Can we cast foundation models as reward functions?
Ekdeep Singh Lubana
Johann Brehmer
P. D. Haan
Taco S. Cohen
OffRL
LRM
244
4
0
06 Dec 2023
Modular Control Architecture for Safe Marine Navigation: Reinforcement Learning and Predictive Safety Filters
Aksel Vaaler
Svein Jostein Husa
Daniel Menges
T. N. Larsen
Adil Rasheed
293
2
0
04 Dec 2023
Extreme Event Prediction with Multi-agent Reinforcement Learning-based Parametrization of Atmospheric and Oceanic Turbulence
R. Mojgani
Daniel Waelchli
Yifei Guan
Petros Koumoutsakos
Pedram Hassanzadeh
AI4Cl
AI4CE
235
6
0
01 Dec 2023
Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play
Adaptive Agents and Multi-Agent Systems (AAMAS), 2023
Daniel Bairamian
Philippe Marcotte
Joshua Romoff
Gabriel Robert
Derek Nowrouzezahrai
202
1
0
28 Nov 2023
From Images to Connections: Can DQN with GNNs learn the Strategic Game of Hex?
Yannik Keller
Johannes Czech
Gopika Sudhakaran
Kristian Kersting
GNN
203
1
0
22 Nov 2023
ADAPTER-RL: Adaptation of Any Agent using Reinforcement Learning
Yi-Fan Jin
Greg Slabaugh
Simon Lucas
OnRL
AI4CE
139
2
0
20 Nov 2023
Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts
Ahmed Hendawy
Jan Peters
Carlo D'Eramo
MoE
234
38
0
19 Nov 2023
Runtime Verification of Learning Properties for Reinforcement Learning Algorithms
T. Mannucci
Julio de Oliveira Filho
OffRL
111
0
0
16 Nov 2023
A Simple Solution for Offline Imitation from Observations and Examples with Possibly Incomplete Trajectories
Neural Information Processing Systems (NeurIPS), 2023
Kai Yan
Alex Schwing
Yu-Xiong Wang
OffRL
283
6
0
02 Nov 2023
Learning to Play Chess from Textbooks (LEAP): a Corpus for Evaluating Chess Moves based on Sentiment Analysis
Haifa Alrdahi
Riza Batista-Navarro
195
2
0
31 Oct 2023
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
Neural Information Processing Systems (NeurIPS), 2023
Shenao Zhang
Boyi Liu
Zhaoran Wang
Tuo Zhao
278
4
0
30 Oct 2023
Metric Flows with Neural Networks
James Halverson
Fabian Ruehle
194
9
0
30 Oct 2023
Explaining the Decisions of Deep Policy Networks for Robotic Manipulations
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Seongun Kim
Jaesik Choi
162
4
0
30 Oct 2023
Optimal Robotic Assembly Sequence Planning: A Sequential Decision-Making Approach
Kartik Nagpal
Negar Mehr
310
1
0
26 Oct 2023
ACES: Generating Diverse Programming Puzzles with Autotelic Generative Models
Julien Pourcel
Cédric Colas
Gaia Molinaro
Pierre-Yves Oudeyer
Laetitia Teodorescu
582
5
0
15 Oct 2023
Alpha Elimination: Using Deep Reinforcement Learning to Reduce Fill-In during Sparse Matrix Decomposition
Arpan Dasgupta
Kiran Ravish
142
1
0
15 Oct 2023
LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
Neural Information Processing Systems (NeurIPS), 2023
Yazhe Niu
Yuan Pu
Zhenjie Yang
Xueyan Li
Tong Zhou
Jiyuan Ren
Shuai Hu
Jiaming Song
Yu Liu
382
20
0
12 Oct 2023
Measuring Feature Sparsity in Language Models
Mingyang Deng
Lucas Tao
Joe Benton
237
2
0
11 Oct 2023
f-Policy Gradients: A General Framework for Goal Conditioned RL using f-Divergences
Neural Information Processing Systems (NeurIPS), 2023
Siddhant Agarwal
Ishan Durugkar
Peter Stone
Amy Zhang
263
12
0
10 Oct 2023
BridgeHand2Vec Bridge Hand Representation
European Conference on Artificial Intelligence (ECAI), 2023
Anna Sztyber-Betley
Filip Kolodziej
Jan Betley
Piotr Duszak
GAN
128
0
0
10 Oct 2023