Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

5 December 2017
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
A. Guez
Marc Lanctot
Laurent Sifre
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis

Papers citing "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm"

50 / 839 papers shown
Title
A Review of Nine Physics Engines for Reinforcement Learning Research
Michael Kaup
Cornelius Wolff
Hyerim Hwang
Julius Mayer
Elia Bruni
AI4CE
262
15
0
11 Jul 2024
A Review of the Applications of Deep Learning-Based Emergent Communication
Brendon Boldt
David R. Mortensen
VLM
221
17
0
03 Jul 2024
Towards Faster Matrix Diagonalization with Graph Isomorphism Networks and the AlphaZero Framework
Geigh Zollicoffer
Kshitij Bhatta
Manish Bhattarai
Phil Romero
C. Negre
Anders M. N. Niklasson
A. Adedoyin
108
0
0
30 Jun 2024
PUZZLES: A Benchmark for Neural Algorithmic Reasoning
Benjamin Estermann
Luca A. Lanzendörfer
Yannick Niedermayr
Roger Wattenhofer
319
11
0
29 Jun 2024
What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs
Raeid Saqur
185
3
0
20 Jun 2024
Exploring and Benchmarking the Planning Capabilities of Large Language Models
Bernd Bohnet
Azade Nova
Aaron T Parisi
Kevin Swersky
Katayoon Goshvadi
Hanjun Dai
Dale Schuurmans
Noah Fiedel
Hanie Sedghi
179
15
0
18 Jun 2024
A Systematization of the Wagner Framework: Graph Theory Conjectures and Reinforcement Learning
IFIP Working Conference on Database Semantics (IWDS), 2024
Flora Angileri
Giulia Lombardi
Andrea Fois
Renato Faraone
C. Metta
...
M. Fantozzi
S. Galfrè
Daniele Pavesi
Maurizio Parton
F. Morandin
137
5
0
18 Jun 2024
Transcendence: Generative Models Can Outperform The Experts That Train Them
Edwin Zhang
Vincent Zhu
Naomi Saphra
Anat Kleiman
Benjamin L. Edelman
Milind Tambe
Sham Kakade
Eran Malach
433
22
0
17 Jun 2024
UniZero: Generalized and Efficient Planning with Scalable Latent World Models
Yuan Pu
Yazhe Niu
Jiyuan Ren
Zhenjie Yang
Hongsheng Li
Yu Liu
OffRL
465
9
0
15 Jun 2024
ContraSolver: Self-Alignment of Language Models by Resolving Internal Preference Contradictions
Xu Zhang
Xunjian Yin
Xiaojun Wan
165
3
0
13 Jun 2024
Planning Like Human: A Dual-process Framework for Dialogue Planning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Tao He
Lizi Liao
Yixin Cao
Yuanxing Liu
Ming Liu
Zerui Chen
Bing Qin
293
41
0
08 Jun 2024
GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents
Anthony Costarelli
Mat Allen
Roman Hauksson
Grace Sodunke
Suhas Hariharan
Carlson Cheng
Wenjie Li
Joshua Clymer
Arjun Yadav
ELM, ReLM, LLMAG, LRM
268
44
0
07 Jun 2024
Open-Endedness is Essential for Artificial Superhuman Intelligence
International Conference on Machine Learning (ICML), 2024
Edward Hughes
Michael Dennis
Jack Parker-Holder
Feryal M. P. Behbahani
Aditi Mavalankar
Yuge Shi
Tom Schaul
Tim Rocktäschel
LRM
334
54
0
06 Jun 2024
Discovering the curriculum with AI: A proof-of-concept demonstration with an intelligent tutoring system for teaching project selection
Lovis Heindrich
Falk Lieder
195
1
0
06 Jun 2024
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents
Yifei Wang
Dizhan Xue
Shengjie Zhang
Shengsheng Qian
AAML, LLMAG
198
72
0
05 Jun 2024
Multi-Agent Transfer Learning via Temporal Contrastive Learning
Weihao Zeng
Joseph Campbell
Simon Stepputtis
Katia Sycara
OffRL
221
2
0
03 Jun 2024
Exploring the limits of Hierarchical World Models in Reinforcement Learning
Robin Schiewer
Anand Subramoney
Laurenz Wiskott
227
7
0
01 Jun 2024
VQA Training Sets are Self-play Environments for Generating Few-shot Pools
Tautvydas Misiunas
Hassan Mansoor
Jasper Uijlings
Oriana Riva
Victor Carbune
LRM, VLM
154
1
0
30 May 2024
Training-efficient density quantum machine learning
Brian Coyle
El Amine Cherrat
Nishant Jain
Natansh Mathur
Snehal Raj
Skander Kazdaghli
Iordanis Kerenidis
317
10
0
30 May 2024
No $D_{\text{train}}$: Model-Agnostic Counterfactual Explanations Using Reinforcement Learning
Xiangyu Sun
Raquel Aoki
Kevin H. Wilson
230
1
0
28 May 2024
Competing for pixels: a self-play algorithm for weakly-supervised segmentation
Shaheer U. Saeed
Shiqi Huang
João Ramalhinho
Iani J. M. B. Gayo
Nina Montaña-Brown
...
Stephen P. Pereira
Brian R. Davidson
D. Barratt
Matthew J. Clarkson
Yipeng Hu
277
0
0
26 May 2024
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
Shuai Zhang
Heshan Devaka Fernando
Miao Liu
K. Murugesan
Songtao Lu
Pin-Yu Chen
Tianyi Chen
Meng Wang
220
3
0
24 May 2024
Multi-turn Reinforcement Learning from Preference Human Feedback
Lior Shani
Aviv Rosenberg
Asaf B. Cassel
Oran Lang
Daniele Calandriello
...
Bilal Piot
Idan Szpektor
Avinatan Hassidim
Yossi Matias
Rémi Munos
211
58
0
23 May 2024
Mixture of Public and Private Distributions in Imperfect Information Games
Jérôme Arjonilla
Abdallah Saffidine
Tristan Cazenave
257
1
0
23 May 2024
Deep Reinforcement Learning for 5*5 Multiplayer Go
Brahim Driss
Jérôme Arjonilla
Hui Wang
Abdallah Saffidine
Tristan Cazenave
848
0
0
23 May 2024
Pure Planning to Pure Policies and In Between with a Recursive Tree Planner
A. N. Redlich
126
0
0
21 May 2024
A Design Trajectory Map of Human-AI Collaborative Reinforcement Learning Systems: Survey and Taxonomy
Zhaoxing Li
176
2
0
16 May 2024
LLMs can learn self-restraint through iterative self-reflection
Alexandre Piché
Aristides Milios
Dzmitry Bahdanau
Chris Pal
326
6
0
15 May 2024
Aspect-based Sentiment Evaluation of Chess Moves (ASSESS): an NLP-based Method for Evaluating Chess Strategies from Textbooks
Haifa Alrdahi
Riza Batista-Navarro
177
1
0
10 May 2024
LLMs with Personalities in Multi-issue Negotiation Games
Sean Noh
Ho-Chun Herbert Chang
LLMAG
349
20
0
08 May 2024
Super-Exponential Regret for UCT, AlphaGo and Variants
Laurent Orseau
Rémi Munos
ELM
141
2
0
07 May 2024
Philosophy of Cognitive Science in the Age of Deep Learning
Raphaël Millière
AI4CE, NAI
203
7
0
07 May 2024
HUGO -- Highlighting Unseen Grid Options: Combining Deep Reinforcement Learning with a Heuristic Target Topology Approach
Malte Lehna
Clara Holzhuter
Sven Tomforde
Christoph Scholz
198
7
0
01 May 2024
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning
Yuxi Xie
Anirudh Goyal
Wenyue Zheng
Min-Yen Kan
Timothy Lillicrap
Kenji Kawaguchi
Michael Shieh
ReLM, LRM
390
194
0
01 May 2024
Learning to Beat ByteRL: Exploitability of Collectible Card Game Agents
Radovan Haluška
Martin Schmid
LLMAG
196
0
0
25 Apr 2024
Playing Board Games with the Predict Results of Beam Search Algorithm
Sergey Pastukhov
90
0
0
23 Apr 2024
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM, LM&Ro
288
45
0
22 Apr 2024
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Ye Tian
Baolin Peng
Linfeng Song
Lifeng Jin
Dian Yu
Haitao Mi
Dong Yu
LRM, ReLM
235
124
0
18 Apr 2024
Monte Carlo Search Algorithms Discovering Monte Carlo Tree Search Exploration Terms
Tristan Cazenave
230
0
0
14 Apr 2024
Mitigating Cascading Effects in Large Adversarial Graph Environments
James Cunningham
Conrad S. Tucker
AI4CE, AAML
134
0
0
12 Apr 2024
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
Neural Information Processing Systems (NeurIPS), 2024
Vahid Balazadeh Meresht
Keertana Chidambaram
Viet Nguyen
Fahad Razak
Vasilis Syrgkanis
402
2
0
10 Apr 2024
Insights from the Use of Previously Unseen Neural Architecture Search Datasets
Computer Vision and Pattern Recognition (CVPR), 2024
Rob Geada
David Towers
M. Forshaw
Amir Atapour-Abarghouei
A. Mcgough
258
5
0
02 Apr 2024
Cooperative Evolutionary Pressure and Diminishing Returns Might Explain the Fermi Paradox: On What Super-AIs Are Like
Daniel Vallstrom
304
0
0
01 Apr 2024
STaR-GATE: Teaching Language Models to Ask Clarifying Questions
Chinmaya Andukuri
Jan-Philipp Fränken
Tobias Gerstenberg
Noah D. Goodman
SyDa, LRM
305
68
0
28 Mar 2024
A survey on Concept-based Approaches For Model Improvement
Avani Gupta
P. J. Narayanan
LRM
288
5
0
21 Mar 2024
JaxUED: A simple and useable UED library in Jax
Samuel Coward
Michael Beukman
Jakob Foerster
181
8
0
19 Mar 2024
Language Evolution with Deep Learning
Mathieu Rita
Paul Michel
Rahma Chaabouni
Olivier Pietquin
Emmanuel Dupoux
Florian Strub
185
3
0
18 Mar 2024
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
E. Zelikman
Georges Harik
Yijia Shao
Varuna Jayasiri
Nick Haber
Noah D. Goodman
LLMAG, ReLM, LRM
643
202
0
14 Mar 2024
Teaching Large Language Models to Reason with Reinforcement Learning
Alex Havrilla
Yuqing Du
Sharath Chandra Raparthy
Christoforos Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Sainbayar Sukhbaatar
Roberta Raileanu
ReLM, LRM
252
141
0
07 Mar 2024
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Shengjie Wang
Shaohuai Liu
Weirui Ye
Jiacheng You
Yang Gao
OffRL
222
27
0
01 Mar 2024