Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

5 December 2017
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
A. Guez
Marc Lanctot
Laurent Sifre
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis

Papers citing "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm"

50 / 839 papers shown
Accelerating Monte Carlo Tree Search with Probability Tree State Abstraction
Neural Information Processing Systems (NeurIPS), 2023
Yangqing Fu
Mingdong Sun
Buqing Nie
Yue Gao
186
5
0
10 Oct 2023
A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning
Ran Wei
Nathan Lambert
Anthony D. McDonald
Alfredo Garcia
Roberto Calandra
312
9
0
10 Oct 2023
Learning Interactive Real-World Simulators
International Conference on Learning Representations (ICLR), 2023
Mengjiao Yang
Yilun Du
Kamyar Ghasemipour
Jonathan Tompson
Leslie Kaelbling
Dale Schuurmans
Pieter Abbeel
LM&Ro, PINN
345
334
0
09 Oct 2023
Hierarchical Reinforcement Learning for Temporal Pattern Prediction
Faith Johnson
Kristin J. Dana
65
0
0
09 Oct 2023
Multi-timestep models for Model-based Reinforcement Learning
Khyati Khandelwal
Giuseppe Paolo
Albert Thomas
Maurizio Filippone
Jun Yao
OffRL
220
1
0
09 Oct 2023
"A Nova Eletricidade: Aplicações, Riscos e Tendências da IA Moderna -- "The New Electricity": Applications, Risks, and Trends in Current AI
A. Bazzan
Anderson R. Tavares
André G. Pereira
C. R. Jung
Jacob Scharcanski
J. Carbonera
Luís C. Lamb
Mariana Recamonde Mendoza
T. L. T. D. Silveira
V. P. Moreira
152
0
0
08 Oct 2023
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
International Conference on Machine Learning (ICML), 2023
Xiaoxiao Sun
Yang Yang
Michal Shlapentokh-Rothman
Haohan Wang
Yu-Xiong Wang
LRM, AI4CE, LM&Ro, LLMAG
439
324
0
06 Oct 2023
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design
Neural Information Processing Systems (NeurIPS), 2023
Matthew Jackson
Minqi Jiang
Jack Parker-Holder
Risto Vuorio
Chris Xiaoxuan Lu
Gregory Farquhar
Shimon Whiteson
Jakob N. Foerster
OOD
226
19
0
04 Oct 2023
Differentially Encoded Observation Spaces for Perceptive Reinforcement Learning
IEEE International Conference on Robotics and Automation (ICRA), 2023
Lev Grossman
Brian Plancher
OffRL
207
1
0
03 Oct 2023
Iterative Option Discovery for Planning, by Planning
Kenny Young
Richard S. Sutton
384
2
0
02 Oct 2023
LEGO-Prover: Neural Theorem Proving with Growing Libraries
International Conference on Learning Representations (ICLR), 2023
Haiming Wang
Huajian Xin
Chuanyang Zheng
Lin Li
Zhengying Liu
...
Enze Xie
Jian Yin
Zhenguo Li
Heng Liao
Xiaodan Liang
LRM
379
108
0
01 Oct 2023
Reinforcement Learning for Node Selection in Branch-and-Bound
Alexander Mattick
Christopher Mutschler
250
4
0
29 Sep 2023
Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym
IEEE Transactions on Evolutionary Computation (IEEE TEVC), 2023
Elena Raponi
Nathanaël Carraz Rakotonirina
Jérémy Rapin
Carola Doerr
O. Teytaud
569
9
0
29 Sep 2023
Efficiency Separation between RL Methods: Model-Free, Model-Based and Goal-Conditioned
Han Bao
Raphaël Jungers
Jean-Charles Delvenne
OffRL
196
1
0
28 Sep 2023
Vision Transformers for Computer Go
Amani Sagri
Tristan Cazenave
Jérôme Arjonilla
Abdallah Saffidine
ViT
125
4
0
22 Sep 2023
Monte-Carlo tree search with uncertainty propagation via optimal transport
Tuan Dam
Pascal Stenger
Lukas Schneider
Joni Pajarinen
Carlo D'Eramo
Odalric-Ambrym Maillard
171
2
0
19 Sep 2023
MBAPPE: MCTS-Built-Around Prediction for Planning Explicitly
Raphael Chekroun
Thomas Gilles
Marin Toromanoff
Sascha Hornauer
Fabien Moutarde
236
17
0
15 Sep 2023
Fidelity-Induced Interpretable Policy Extraction for Reinforcement Learning
Xiao Liu
Wubing Chen
Mao Tan
204
2
0
12 Sep 2023
Neurosymbolic Reinforcement Learning and Planning: A Survey
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2023
Kamal Acharya
Waleed Raza
Carlos Dourado
Alvaro Velasquez
Houbing Song
NAI, OffRL
230
39
0
02 Sep 2023
DRL-Based Trajectory Tracking for Motion-Related Modules in Autonomous Driving
Yinda Xu
Lidong Yu
249
7
0
30 Aug 2023
Stabilizing Unsupervised Environment Design with a Learned Adversary
Ishita Mediratta
Minqi Jiang
Jack Parker-Holder
Michael Dennis
Eugene Vinitsky
Tim Rocktäschel
329
19
0
21 Aug 2023
DFB: A Data-Free, Low-Budget, and High-Efficacy Clean-Label Backdoor Attack
Binhao Ma
Jiahui Wang
Dejun Wang
Bo Meng
AAML
178
0
0
18 Aug 2023
Generating Personas for Games with Multimodal Adversarial Imitation Learning
William Ahlberg
Alessandro Sestini
Konrad Tollmar
Linus Gisslén
GAN
162
8
0
15 Aug 2023
Provably Efficient Algorithm for Nonstationary Low-Rank MDPs
Neural Information Processing Systems (NeurIPS), 2023
Yuan Cheng
J. Yang
Yitao Liang
OOD
217
1
0
10 Aug 2023
JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games
Yang Li
Kun Xiong
Yingping Zhang
Jiangcheng Zhu
Alexander Shmakov
Wei Pan
Jun Wang
Zonghong Dai
Yaodong Yang
292
3
0
09 Aug 2023
AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning
Michaël Mathieu
Sherjil Ozair
Srivatsan Srinivasan
Çağlar Gülçehre
Shangtong Zhang
...
Sergio Gomez Colmenarejo
Aaron van den Oord
Wojciech M. Czarnecki
Nando de Freitas
Oriol Vinyals
OffRL
176
14
0
07 Aug 2023
CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters
Symposium on Networked Systems Design and Implementation (NSDI), 2023
S. Rajasekaran
M. Ghobadi
Aditya Akella
GNN
135
87
0
01 Aug 2023
SAKSHI: Decentralized AI Platforms
S. Bhat
Canhui Chen
Zerui Cheng
Zhixuan Fang
Ashwin Hebbar
...
Ranvir Rana
Peiyao Sheng
Himanshu Tyagi
Pramod Viswanath
Xuechao Wang
101
7
0
31 Jul 2023
Thinker: Learning to Plan and Act
Neural Information Processing Systems (NeurIPS), 2023
Stephen Chung
Ivan Anokhin
David M. Krueger
LLMAG, OffRL, LRM
294
12
0
27 Jul 2023
Towards General Game Representations: Decomposing Games Pixels into Content and Style
C. Trivedi
Konstantinos Makantasis
Antonios Liapis
Georgios N. Yannakakis
OCL
199
3
0
20 Jul 2023
PyTAG: Challenges and Opportunities for Reinforcement Learning in Tabletop Games
Martin Balla
G. E. Long
Dominik Jeurissen
J. Goodman
Raluca D. Gaina
Diego Perez-Liebana
LMTD, OffRL, OnRL
203
3
0
19 Jul 2023
Towards A Unified Agent with Foundation Models
Norman Di Palo
Arunkumar Byravan
Leonard Hasenclever
Markus Wulfmeier
N. Heess
Martin Riedmiller
LM&Ro, LLMAG, OffRL
241
70
0
18 Jul 2023
Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Zhaofeng Wu
Linlu Qiu
Alexis Ross
Ekin Akyürek
Boyuan Chen
Bailin Wang
Najoung Kim
Jacob Andreas
Yoon Kim
LRM, ReLM
426
302
0
05 Jul 2023
Dynamic Feature-based Deep Reinforcement Learning for Flow Control of Circular Cylinder with Sparse Surface Pressure Sensing
Journal of Fluid Mechanics (JFM), 2023
Qiulei Wang
Lei Yan
Gang Hu
Wenli Chen
Jean Rabault
B. R. Noack
AI4CE
240
45
0
05 Jul 2023
Enhancing Dexterity in Robotic Manipulation via Hierarchical Contact Exploration
IEEE Robotics and Automation Letters (RA-L), 2023
Xianyi Cheng
Sarvesh Patil
Zeynep Temel
Oliver Kroemer
M. T. Mason
341
38
0
01 Jul 2023
Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks
Neural Information Processing Systems (NeurIPS), 2023
Maxime Chevalier-Boisvert
Bolun Dai
Mark Towers
Rodrigo de Lazcano
Lucas Willems
Salem Lahlou
Suman Pal
Pablo Samuel Castro
Jordan Terry
VGen
353
308
0
24 Jun 2023
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
International Conference on Machine Learning (ICML), 2023
Hang Wang
Sen Lin
Junshan Zhang
OffRL, OnRL
226
4
0
20 Jun 2023
iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning
Xiyang Wu
Rohan Chandra
Tianrui Guan
Amrit Singh Bedi
Tianyi Zhou
364
5
0
09 Jun 2023
Introduction to Latent Variable Energy-Based Models: A Path Towards Autonomous Machine Intelligence
Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.), 2023
Anna Dawid
Yann LeCun
DRL
260
47
0
05 Jun 2023
Active Vision Reinforcement Learning under Limited Visual Observability
Neural Information Processing Systems (NeurIPS), 2023
Jinghuan Shang
Michael S. Ryoo
313
0
0
01 Jun 2023
Non-stationary Reinforcement Learning under General Function Approximation
International Conference on Machine Learning (ICML), 2023
Songtao Feng
Ming Yin
Ruiquan Huang
Yu Wang
J. Yang
Yitao Liang
158
10
0
01 Jun 2023
Cross-Domain Policy Adaptation via Value-Guided Data Filtering
Neural Information Processing Systems (NeurIPS), 2023
Kang Xu
Chenjia Bai
Xiaoteng Ma
Dong Wang
Bingyan Zhao
Zhen Wang
Xuelong Li
Wei Li
309
26
0
28 May 2023
Self-Supervised Reinforcement Learning that Transfers using Random Features
Neural Information Processing Systems (NeurIPS), 2023
Boyuan Chen
Chuning Zhu
Pulkit Agrawal
Jianchao Tan
Abhishek Gupta
OffRL, SSL
247
12
0
26 May 2023
Model-Based Simulation for Optimising Smart Reply
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Benjamin Towle
Ke Zhou
177
1
0
26 May 2023
Reasoning with Language Model is Planning with World Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shibo Hao
Yi Gu
Haodi Ma
Joshua Jiahua Hong
Zhen Wang
D. Wang
Zhiting Hu
ReLM, LRM, LLMAG
450
831
0
24 May 2023
ADA-GP: Accelerating DNN Training By Adaptive Gradient Prediction
Micro (MICRO), 2023
Vahid Janfaza
Shantanu Mandal
Farabi Mahmud
A. Muzahid
161
5
0
22 May 2023
Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning
Daniel Waelchli
Pascal Weber
Petros Koumoutsakos
AI4CE
218
4
0
17 May 2023
Large Language Model Guided Tree-of-Thought
Jieyi Long
LM&Ro, LRM
230
282
0
15 May 2023
Stackelberg Games for Learning Emergent Behaviors During Competitive Autocurricula
IEEE International Conference on Robotics and Automation (ICRA), 2023
Boling Yang
Liyuan Zheng
Lillian J. Ratliff
Byron Boots
Joshua R. Smith
197
6
0
04 May 2023
Physics-Inspired Interpretability Of Machine Learning Models
Maximilian P. Niroomand
D. Wales
AI4CE
108
1
0
05 Apr 2023