Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

5 December 2017
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
A. Guez
Marc Lanctot
Laurent Sifre
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis

Papers citing "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm"

50 / 207 papers shown
Title
CubeTR: Learning to Solve The Rubiks Cube Using Transformers
Mustafa Chasmai
ViT
26
1
0
11 Nov 2021
AlphaD3M: Machine Learning Pipeline Synthesis
Iddo Drori
Yamuna Krishnamurthy
Rémi Rampin
Raoni Lourenço
Jorge Piazentin Ono
Kyunghyun Cho
Claudio Silva
J. Freire
14
85
0
03 Nov 2021
Adaptive Discretization in Online Reinforcement Learning
Sean R. Sinclair
Siddhartha Banerjee
C. Yu
OffRL
40
15
0
29 Oct 2021
Hindsight Goal Ranking on Replay Buffer for Sparse Reward Environment
T. Luu
Chang-Dong Yoo
10
8
0
28 Oct 2021
Measuring the Non-Transitivity in Chess
R. Sanjaya
Jun Wang
Yaodong Yang
11
22
0
22 Oct 2021
In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications
Borja G. Leon
Murray Shanahan
Francesco Belardinelli
AI4CE
23
15
0
18 Oct 2021
Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning
Edoardo Cetin
Oya Celiktutan
OffRL
39
16
0
07 Oct 2021
Deep Synoptic Monte Carlo Planning in Reconnaissance Blind Chess
Gregory Clark
25
9
0
05 Oct 2021
Reinforcement Learning with Information-Theoretic Actuation
Elliot Catt
Marcus Hutter
J. Veness
39
0
0
30 Sep 2021
Learning General Optimal Policies with Graph Neural Networks: Expressive Power, Transparency, and Limits
Simon Ståhlberg
Blai Bonet
Hector Geffner
38
48
0
21 Sep 2021
Target Languages (vs. Inductive Biases) for Learning to Act and Plan
Hector Geffner
39
6
0
15 Sep 2021
On Solving a Stochastic Shortest-Path Markov Decision Process as Probabilistic Inference
Mohamed Baioumy
Bruno Lacerda
Paul Duckworth
Nick Hawes
22
3
0
13 Sep 2021
Explaining Bayesian Neural Networks
Kirill Bykov
Marina M.-C. Höhne
Adelaida Creosteanu
Klaus-Robert Müller
Frederick Klauschen
Shinichi Nakajima
Marius Kloft
BDL
AAML
31
25
0
23 Aug 2021
Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control
Dimitri Bertsekas
AI4CE
40
55
0
20 Aug 2021
Train on Small, Play the Large: Scaling Up Board Games with AlphaZero and GNN
Shai Ben-Assayag
Ran El-Yaniv
GNN
27
9
0
18 Jul 2021
Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction
Assaf Hallak
Gal Dalal
Steven Dalton
I. Frosio
Shie Mannor
Gal Chechik
OffRL
OnRL
35
9
0
04 Jul 2021
Continuous Control with Deep Reinforcement Learning for Autonomous Vessels
Nader Zare
Bruno Brandoli
Mahtab Sarvmaili
Amílcar Soares
Stan Matwin
11
8
0
27 Jun 2021
Policy Smoothing for Provably Robust Reinforcement Learning
Aounon Kumar
Alexander Levine
S. Feizi
AAML
15
54
0
21 Jun 2021
Graceful Degradation and Related Fields
J. Dymond
31
4
0
21 Jun 2021
Hierarchical RNNs-Based Transformers MADDPG for Mixed Cooperative-Competitive Environments
Xiaolong Wei
Lifang Yang
Xianglin Huang
Gang Cao
Zhulin Tao
Zhengyang Du
Jing An
21
6
0
11 May 2021
Sifting out the features by pruning: Are convolutional networks the winning lottery ticket of fully connected ones?
Franco Pellegrini
Giulio Biroli
49
6
0
27 Apr 2021
Qubit Routing using Graph Neural Network aided Monte Carlo Tree Search
Animesh Sinha
Utkarsh Azad
Harjinder Singh
39
27
0
01 Apr 2021
Self-adaptive Torque Vectoring Controller Using Reinforcement Learning
Shayan Taherian
Sampo Kuutti
Marco Visca
Saber Fallah
6
4
0
27 Mar 2021
Policy-Guided Heuristic Search with Guarantees
Laurent Orseau
Levi H. S. Lelis
18
26
0
21 Mar 2021
Neural Networks and Denotation
E. Allen
17
0
0
15 Mar 2021
Sample-efficient Reinforcement Learning Representation Learning with Curiosity Contrastive Forward Dynamics Model
Thanh Nguyen
T. Luu
Thang Vu
Chang-Dong Yoo
15
17
0
15 Mar 2021
Learning to run a Power Network Challenge: a Retrospective Analysis
Antoine Marot
Benjamin Donnot
Gabriel Dulac-Arnold
A. Kelly
A. O'Sullivan
J. Viebahn
M. Awad
Isabelle M Guyon
P. Panciatici
Camilo Romero
9
77
0
02 Mar 2021
PerSim: Data-Efficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators
Anish Agarwal
Abdullah Alomar
Varkey Alumootil
Devavrat Shah
Dennis Shen
Zhi Xu
Cindy Yang
OffRL
18
18
0
13 Feb 2021
Deep Reinforcement Learning for the Control of Robotic Manipulation: A Focussed Mini-Review
Rongrong Liu
F. Nageotte
P. Zanne
M. de Mathelin
Birgitta Dresp
42
143
0
08 Feb 2021
Differentiable Trust Region Layers for Deep Reinforcement Learning
Fabian Otto
P. Becker
Ngo Anh Vien
Hanna Ziesche
Gerhard Neumann
OffRL
28
19
0
22 Jan 2021
Asymmetric self-play for automatic goal discovery in robotic manipulation
OpenAI
Matthias Plappert
Raul Sampedro
Tao Xu
Ilge Akkaya
...
Hyeonwoo Noh
Lilian Weng
Qiming Yuan
Casey Chu
Wojciech Zaremba
SSL
76
76
0
13 Jan 2021
Open Problems in Cooperative AI
Allan Dafoe
Edward Hughes
Yoram Bachrach
Tantum Collins
Kevin R. McKee
Joel Z. Leibo
Kate Larson
T. Graepel
24
199
0
15 Dec 2020
Relative Variational Intrinsic Control
Kate Baumli
David Warde-Farley
S. Hansen
Volodymyr Mnih
18
42
0
14 Dec 2020
Hindsight and Sequential Rationality of Correlated Play
Dustin Morrill
Ryan D'Orazio
Reca Sarfati
Marc Lanctot
James Wright
A. Greenwald
Michael H. Bowling
21
30
0
10 Dec 2020
Ensemble Squared: A Meta AutoML System
Jason Yoo
Tony Joseph
Dylan Yung
S. Nasseri
Frank D. Wood
MoE
19
8
0
10 Dec 2020
EvoCraft: A New Challenge for Open-Endedness
Djordje Grbic
Rasmus Berg Palm
Elias Najarro
Claire Glanois
S. Risi
11
30
0
08 Dec 2020
Deep Reinforcement Learning for Resource Constrained Multiclass Scheduling in Wireless Networks
Apostolos Avranas
Marios Kountouris
P. Ciblat
19
7
0
27 Nov 2020
Experimental design for MRI by greedy policy search
Tim Bakker
H. V. Hoof
Max Welling
27
55
0
30 Oct 2020
SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving
Ming Zhou
Jun-Jie Luo
Julian Villela
Yaodong Yang
David Rusu
...
H. Ammar
Hongbo Zhang
Wulong Liu
Jianye Hao
Jun Wang
139
193
0
19 Oct 2020
Learning to Play Two-Player Perfect-Information Games without Knowledge
Quentin Cohen-Solal
OffRL
30
13
0
03 Aug 2020
Monte-Carlo Tree Search as Regularized Policy Optimization
Jean-Bastien Grill
Florent Altché
Yunhao Tang
Thomas Hubert
Michal Valko
Ioannis Antonoglou
Rémi Munos
19
73
0
24 Jul 2020
Rinascimento: using event-value functions for playing Splendor
Ivan Bravi
Simon Lucas
14
2
0
10 Jun 2020
POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis
Weichao Mao
K. Zhang
Qiaomin Xie
Tamer Basar
11
14
0
08 Jun 2020
Reassessing Claims of Human Parity and Super-Human Performance in Machine Translation at WMT 2019
Antonio Toral
11
43
0
12 May 2020
How Do You Act? An Empirical Study to Understand Behavior of Deep Reinforcement Learning Agents
Richard Meyes
Moritz Schneider
Tobias Meisen
26
2
0
07 Apr 2020
A Survey of End-to-End Driving: Architectures and Training Methods
Ardi Tampuu
Maksym Semikin
Naveed Muhammad
D. Fishman
Tambet Matiisen
3DV
23
228
0
13 Mar 2020
On Reinforcement Learning for Turn-based Zero-sum Markov Games
Devavrat Shah
Varun Somani
Qiaomin Xie
Zhi Xu
13
11
0
25 Feb 2020
Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence
S. Raschka
Joshua Patterson
Corey J. Nolet
AI4CE
21
482
0
12 Feb 2020
Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems
Sushmita Bhattacharya
Sahil Badyal
Thomas Wheeler
Stephanie Gil
Dimitri Bertsekas
19
33
0
11 Feb 2020
Taming an autonomous surface vehicle for path following and collision avoidance using deep reinforcement learning
Eivind Meyer
Haakon Robinson
Adil Rasheed
Omer San
17
65
0
18 Dec 2019