1712.01815
Cited By
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
5 December 2017
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
A. Guez
Marc Lanctot
Laurent Sifre
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis
Papers citing
"Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm"
50 / 207 papers shown
Title
Dynamic Feature-based Deep Reinforcement Learning for Flow Control of Circular Cylinder with Sparse Surface Pressure Sensing
Qiulei Wang
Lei Yan
Gang Hu
Wenli Chen
Jean Rabault
B. R. Noack
AI4CE
23
24
0
05 Jul 2023
Model-Based Simulation for Optimising Smart Reply
Benjamin Towle
Ke Zhou
32
1
0
26 May 2023
Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning
Daniel Waelchli
Pascal Weber
P. Koumoutsakos
AI4CE
17
4
0
17 May 2023
Adaptive Feature Fusion: Enhancing Generalization in Deep Learning Models
Neelesh Mungoli
28
23
0
04 Apr 2023
Managing power grids through topology actions: A comparative study between advanced rule-based and reinforcement learning agents
Malte Lehna
J. Viebahn
Christoph Scholz
Antoine Marot
Sven Tomforde
24
19
0
03 Apr 2023
Meta-Learning Parameterized First-Order Optimizers using Differentiable Convex Optimization
Tanmay Gautam
Samuel Pfrommer
Somayeh Sojoudi
20
2
0
29 Mar 2023
A New Policy Iteration Algorithm For Reinforcement Learning in Zero-Sum Markov Games
Anna Winnicki
R. Srikant
34
1
0
17 Mar 2023
A Reinforcement Learning Approach for Scheduling Problems With Improved Generalization Through Order Swapping
Deepak Vivekanandan
Samuel Wirth
Patrick Karlbauer
Noah Klarmann
16
6
0
27 Feb 2023
TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play
Fanqing Lin
Shiyu Huang
Tim Pearce
Wenze Chen
Weijuan Tu
26
17
0
15 Feb 2023
Energy Efficiency of Training Neural Network Architectures: An Empirical Study
Yi Xu
Silverio Martínez-Fernández
Matias Martinez
Xavier Franch
20
13
0
02 Feb 2023
Policy-Value Alignment and Robustness in Search-based Multi-Agent Learning
Niko A. Grupen
M. Hanlon
Alexis Hao
Daniel D. Lee
B. Selman
19
0
0
27 Jan 2023
PushWorld: A benchmark for manipulation planning with tools and movable obstacles
Ken Kansky
Skanda Vaidyanath
Scott Swingle
Xinghua Lou
Miguel Lazaro-Gredilla
Dileep George
21
4
0
24 Jan 2023
Character Simulation Using Imitation Learning With Game Engine Physics
Joao Rodrigues
R. Nóbrega
AI4CE
14
2
0
05 Jan 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
62
1,476
0
15 Dec 2022
Deep Incubation: Training Large Models by Divide-and-Conquering
Zanlin Ni
Yulin Wang
Jiangwei Yu
Haojun Jiang
Yu Cao
Gao Huang
VLM
18
11
0
08 Dec 2022
ISAACS: Iterative Soft Adversarial Actor-Critic for Safety
Kai Hsu
D. Nguyen
J. F. Fisac
23
30
0
06 Dec 2022
Actively Learning Costly Reward Functions for Reinforcement Learning
André Eberhard
Houssam Metni
G. Fahland
A. Stroh
Pascal Friederich
OffRL
25
0
0
23 Nov 2022
Deep Instance Segmentation and Visual Servoing to Play Jenga with a Cost-Effective Robotic System
Luca Marchionna
G. Pugliese
Mauro Martini
Simone Angarano
Francesco Salvetti
Marcello Chiaberge
41
3
0
15 Nov 2022
Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
Siddharth Nayak
Kenneth M. F. Choi
Wenqi Ding
Sydney I. Dolan
Karthik Gopalakrishnan
H. Balakrishnan
17
29
0
03 Nov 2022
Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration
Mesut Yang
Micah Carroll
Anca Dragan
29
13
0
03 Nov 2022
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
21
74
0
26 Oct 2022
Will we run out of data? Limits of LLM scaling based on human-generated data
Pablo Villalobos
A. Ho
J. Sevilla
T. Besiroglu
Lennart Heim
Marius Hobbhahn
ALM
33
109
0
26 Oct 2022
CEIP: Combining Explicit and Implicit Priors for Reinforcement Learning with Demonstrations
Kai Yan
A. Schwing
Yu-xiong Wang
OffRL
30
2
0
18 Oct 2022
The Debate Over Understanding in AI's Large Language Models
Melanie Mitchell
D. Krakauer
ELM
74
203
0
14 Oct 2022
Efficient circuit implementation for coined quantum walks on binary trees and application to reinforcement learning
Thomas Mullor
David Vigouroux
Louis Bethune
14
0
0
13 Oct 2022
Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery
Félix Chalumeau
Raphael Boige
Bryan Lim
Valentin Macé
Maxime Allard
Arthur Flajolet
Antoine Cully
Thomas Pierrot
26
21
0
06 Oct 2022
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Oren Neumann
C. Gros
24
26
0
29 Sep 2022
Design of experiments for the calibration of history-dependent models via deep reinforcement learning and an enhanced Kalman filter
Ruben Villarreal
Nikolaos N. Vlassis
Nhon N. Phan
Tommie A. Catanach
Reese E. Jones
N. Trask
S. Kramer
WaiChing Sun
OffRL
30
11
0
27 Sep 2022
Parallel Monte Carlo Tree Search with Batched Rigid-body Simulations for Speeding up Long-Horizon Episodic Robot Planning
Baichuan Huang
Abdeslam Boularias
Jingjin Yu
12
9
0
14 Jul 2022
Stabilizing Off-Policy Deep Reinforcement Learning from Pixels
Edoardo Cetin
Philip J. Ball
Steve Roberts
Oya Celiktutan
30
36
0
03 Jul 2022
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine
Jiayi Weng
Min-Bin Lin
Shengyi Huang
Bo Liu
Denys Makoviichuk
...
Yufan Song
Ting Luo
Yukun Jiang
Zhongwen Xu
Shuicheng Yan
MoE
11
59
0
21 Jun 2022
A Survey on Model-based Reinforcement Learning
Fan Luo
Tian Xu
Hang Lai
Xiong-Hui Chen
Weinan Zhang
Yang Yu
OffRL
LRM
44
101
0
19 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
42
348
0
17 Jun 2022
Rapid Learning of Spatial Representations for Goal-Directed Navigation Based on a Novel Model of Hippocampal Place Fields
Adedapo Alabi
D. Vanderelst
A. Minai
9
2
0
05 Jun 2022
Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search
Michał Zawalski
Michał Tyrolski
K. Czechowski
Tomasz Odrzygóźdź
Damian Stachura
Piotr Piękos
Yuhuai Wu
Łukasz Kuciński
Piotr Miłoś
LRM
13
8
0
01 Jun 2022
HyperTree Proof Search for Neural Theorem Proving
Guillaume Lample
Marie-Anne Lachaux
Thibaut Lavril
Xavier Martinet
Amaury Hayat
Gabriel Ebner
Aurelien Rodriguez
Timothée Lacroix
AIMat
23
133
0
23 May 2022
Chain of Thought Imitation with Procedure Cloning
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
OffRL
30
29
0
22 May 2022
Adversarial Training for High-Stakes Reliability
Daniel M. Ziegler
Seraphina Nix
Lawrence Chan
Tim Bauman
Peter Schmidt-Nielsen
...
Noa Nabeshima
Benjamin Weinstein-Raun
D. Haas
Buck Shlegeris
Nate Thomas
AAML
30
59
0
03 May 2022
Graph Neural Network based Agent in Google Research Football
Yizhan Niu
Jinglong Liu
Yuhao Shi
Jiren Zhu
GNN
21
2
0
23 Apr 2022
Adversarial Learning to Reason in an Arbitrary Logic
Stanislaw J. Purgal
C. Kaliszyk
27
1
0
06 Apr 2022
PerfectDou: Dominating DouDizhu with Perfect Information Distillation
Yang Guan
Minghuan Liu
Weijun Hong
Weinan Zhang
Fei Fang
Guangjun Zeng
Yue Lin
25
26
0
30 Mar 2022
Remember and Forget Experience Replay for Multi-Agent Reinforcement Learning
Pascal Weber
Daniel Wälchli
Mustafa Zeqiri
P. Koumoutsakos
CLL
OffRL
10
7
0
24 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics
Honghu Xue
Benedikt Hein
M. Bakr
Georg Schildbach
Bengt Abel
Elmar Rueckert
16
15
0
23 Feb 2022
Open-Ended Reinforcement Learning with Neural Reward Functions
Robert Meier
Asier Mujika
37
7
0
16 Feb 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
25
269
0
11 Feb 2022
Uncovering Instabilities in Variational-Quantum Deep Q-Networks
Maja Franz
Lucas Wolf
Maniraman Periyasamy
Christian Ufrecht
Daniel D. Scherer
Axel Plinge
Christopher Mutschler
Wolfgang Mauerer
24
29
0
10 Feb 2022
Formal Mathematics Statement Curriculum Learning
Stanislas Polu
Jesse Michael Han
Kunhao Zheng
Mantas Baksys
Igor Babuschkin
Ilya Sutskever
AIMat
81
116
0
03 Feb 2022
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
27
166
0
08 Dec 2021
Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL
Charles Packer
Pieter Abbeel
Joseph E. Gonzalez
OffRL
21
18
0
02 Dec 2021