1802.05098
Cited By
v1
v2
v3 (latest)
DiCE: The Infinitely Differentiable Monte-Carlo Estimator
14 February 2018
Jakob N. Foerster
Gregory Farquhar
Maruan Al-Shedivat
Tim Rocktäschel
Eric Xing
Shimon Whiteson
Github (148★)
Papers citing "DiCE: The Infinitely Differentiable Monte-Carlo Estimator"
50 / 66 papers shown
Title
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization
Zishun Yu
Tengyu Xu
Di Jin
Karthik Abinav Sankararaman
Yun He
...
Eryk Helenowski
Chen Zhu
Sinong Wang
Hao Ma
Han Fang
LRM
230
11
0
29 Jan 2025
Imperative Learning: A Self-supervised Neuro-Symbolic Learning Framework for Robot Autonomy
Chen Wang
Kaiyi Ji
Junyi Geng
Zhongqiang Ren
Taimeng Fu
...
Yi Du
Qihang Li
Yue Yang
Xiao Lin
Zhipeng Zhao
SSL
160
10
0
28 Jan 2025
Hierarchical Multi-agent Meta-Reinforcement Learning for Cross-channel Bidding
Shenghong He
Chao Yu
92
0
0
26 Dec 2024
Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference
Matthew D Riemer
G. Subbaraj
Glen Berseth
Irina Rish
OffRL
140
2
0
18 Dec 2024
Multi-agent cooperation through learning-aware policy gradients
Alexander Meulemans
Seijin Kobayashi
J. Oswald
Nino Scherrer
Eric Elmoznino
Blake A. Richards
Guillaume Lajoie
Blaise Agüera y Arcas
João Sacramento
79
1
0
24 Oct 2024
Probabilistic Programming with Programmable Variational Inference
McCoy R. Becker
Alexander K. Lew
Xiaoyan Wang
Matin Ghavami
Mathieu Huot
Martin Rinard
Vikash K. Mansinghka
131
5
0
22 Jun 2024
Advantage Alignment Algorithms
Juan Agustin Duque
Milad Aghajohari
Tim Cooijmans
Tianyu Zhang
Rameswar Panda
Gauthier Gidel
Aaron Courville
82
2
0
20 Jun 2024
LOQA: Learning with Opponent Q-Learning Awareness
Milad Aghajohari
Juan Agustin Duque
Tim Cooijmans
Rameswar Panda
74
4
0
02 May 2024
ULLER: A Unified Language for Learning and Reasoning
Emile van Krieken
Samy Badreddine
Robin Manhaeve
Eleonora Giunchiglia
NAI
110
3
0
01 May 2024
Differentiable and Stable Long-Range Tracking of Multiple Posterior Modes
Ali Younis
Erik B. Sudderth
61
4
0
12 Apr 2024
Leading the Pack: N-player Opponent Shaping
Alexandra Souly
Timon Willi
Akbir Khan
Robert Kirk
Chris Xiaoxuan Lu
Edward Grefenstette
Tim Rocktäschel
131
3
0
19 Dec 2023
Meta-Value Learning: a General Framework for Learning with Learning Awareness
Tim Cooijmans
Milad Aghajohari
Rameswar Panda
64
6
0
17 Jul 2023
Incentivizing honest performative predictions with proper scoring rules
Caspar Oesterheld
Johannes Treutlein
Emery Cooper
Rubi Hudson
69
7
0
28 May 2023
Off-Policy Action Anticipation in Multi-Agent Reinforcement Learning
Ariyan Bighashdel
Daan de Geus
P. Jancura
Gijs Dubbelman
OffRL
LRM
48
1
0
04 Apr 2023
Coordinating Fully-Cooperative Agents Using Hierarchical Learning Anticipation
Ariyan Bighashdel
Daan de Geus
P. Jancura
Gijs Dubbelman
47
1
0
15 Mar 2023
Learning Adaptable Risk-Sensitive Policies to Coordinate in Multi-Agent General-Sum Games
Ziyi Liu
Yongchun Fang
45
1
0
14 Mar 2023
Learning to Influence Human Behavior with Offline Reinforcement Learning
Joey Hong
Sergey Levine
Anca Dragan
OffRL
AI4CE
85
0
0
03 Mar 2023
Differentiable Simulations for Enhanced Sampling of Rare Events
Martin Sípka
Johannes C. B. Dietschreit
Lukáš Grajciar
Rafael Gómez-Bombarelli
127
11
0
09 Jan 2023
Reusable Options through Gradient-based Meta Learning
David Kuric
H. V. Hoof
91
0
0
22 Dec 2022
ADEV: Sound Automatic Differentiation of Expected Values of Probabilistic Programs
Alexander K. Lew
Mathieu Huot
S. Staton
Vikash K. Mansinghka
64
22
0
13 Dec 2022
Proximal Learning With Opponent-Learning Awareness
S. Zhao
Chris Xiaoxuan Lu
Roger C. Grosse
Jakob N. Foerster
80
21
0
18 Oct 2022
Enhanced Meta Reinforcement Learning using Demonstrations in Sparse Reward Environments
Desik Rengarajan
Sapana Chaudhary
JaeWon Kim
D. Kalathil
S. Shakkottai
OffRL
65
2
0
26 Sep 2022
An Investigation of the Bias-Variance Tradeoff in Meta-Gradients
Risto Vuorio
Jacob Beck
Shimon Whiteson
Jakob N. Foerster
Gregory Farquhar
87
7
0
22 Sep 2022
Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization
Leo Feng
Padideh Nouri
Aneri Muni
Yoshua Bengio
Pierre-Luc Bacon
170
4
0
13 Sep 2022
Robust Task Representations for Offline Meta-Reinforcement Learning via Contrastive Learning
Haoqi Yuan
Zongqing Lu
SSL
OffRL
86
42
0
21 Jun 2022
Critic Sequential Monte Carlo
Vasileios Lioutas
J. Lavington
Justice Sefas
Matthew Niedoba
Yunpeng Liu
Berend Zwartsenberg
Setareh Dabiri
Frank Wood
Adam Scibior
105
7
0
30 May 2022
Sparse Graph Learning from Spatiotemporal Time Series
Andrea Cini
Daniele Zambon
Cesare Alippi
CML
AI4TS
125
19
0
26 May 2022
Model-Free Opponent Shaping
Chris Xiaoxuan Lu
Timon Willi
Christian Schroeder de Witt
Jakob N. Foerster
113
44
0
03 May 2022
COLA: Consistent Learning with Opponent-Learning Awareness
Timon Willi
Alistair Letcher
Johannes Treutlein
Jakob N. Foerster
70
52
0
08 Mar 2022
Influencing Long-Term Behavior in Multiagent Reinforcement Learning
Dong-Ki Kim
Matthew D Riemer
Miao Liu
Jakob N. Foerster
Michael Everett
Chuangchuang Sun
Gerald Tesauro
Jonathan P. How
145
0
0
07 Mar 2022
A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning
Xidong Feng
Bo Liu
Jie Ren
Luo Mai
Rui Zhu
Haifeng Zhang
Jun Wang
Yaodong Yang
96
12
0
31 Dec 2021
Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning
Yunhao Tang
68
7
0
14 Dec 2021
Meta-CPR: Generalize to Unseen Large Number of Agents with Communication Pattern Recognition Module
Wei-Cheng Tseng
Wei Wei
Da-Cheng Juan
Min Sun
83
2
0
14 Dec 2021
ROMAX: Certifiably Robust Deep Multiagent Reinforcement Learning via Convex Relaxation
Chuangchuang Sun
Dong-Ki Kim
Jonathan P. How
AAML
92
19
0
14 Sep 2021
Robust Predictable Control
Benjamin Eysenbach
Ruslan Salakhutdinov
Sergey Levine
OffRL
91
45
0
07 Sep 2021
Model-Based Opponent Modeling
Xiaopeng Yu
Jiechuan Jiang
Wanpeng Zhang
Haobin Jiang
Zongqing Lu
OffRL
109
29
0
04 Aug 2021
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation
Yunhao Tang
Tadashi Kozuno
Mark Rowland
Rémi Munos
Michal Valko
OffRL
136
9
0
24 Jun 2021
Differentiable Particle Filtering without Modifying the Forward Pass
Adam Scibior
Frank Wood
95
19
0
18 Jun 2021
Neural Auto-Curricula
Xidong Feng
Oliver Slumbers
Ziyu Wan
Bo Liu
Stephen Marcus McAleer
Ying Wen
Jun Wang
Yaodong Yang
96
2
0
04 Jun 2021
A unified view of likelihood ratio and reparameterization gradients
Paavo Parmas
Masashi Sugiyama
53
9
0
31 May 2021
Searching with Opponent-Awareness
Timy Phan
22
0
0
21 Apr 2021
Storchastic: A Framework for General Stochastic Automatic Differentiation
Emile van Krieken
Jakub M. Tomczak
A. T. Teije
ODL
OffRL
103
16
0
01 Apr 2021
Towards Continual Reinforcement Learning: A Review and Perspectives
Khimya Khetarpal
Matthew D Riemer
Irina Rish
Doina Precup
CLL
OffRL
142
324
0
25 Dec 2020
Direct Evolutionary Optimization of Variational Autoencoders With Binary Latents
E. Guiraud
Jakob Drefs
Jörg Lücke
DRL
83
3
0
27 Nov 2020
Opponent Learning Awareness and Modelling in Multi-Objective Normal Form Games
Roxana Rădulescu
T. Verstraeten
Yijie Zhang
Patrick Mannion
D. Roijers
A. Nowé
63
14
0
14 Nov 2020
Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
Jack Parker-Holder
Luke Metz
Cinjon Resnick
Hengyuan Hu
Adam Lerer
Alistair Letcher
A. Peysakhovich
Aldo Pacchiano
Jakob N. Foerster
50
24
0
12 Nov 2020
Emergent Reciprocity and Team Formation from Randomized Uncertain Social Preferences
Bowen Baker
LRM
69
37
0
10 Nov 2020
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
Dong-Ki Kim
Miao Liu
Matthew D Riemer
Chuangchuang Sun
Marwa Abdulhai
Golnaz Habibi
Sebastian Lopez-Cot
Gerald Tesauro
Jonathan P. How
56
56
0
31 Oct 2020
GO Hessian for Expectation-Based Objectives
Yulai Cong
Miaoyun Zhao
Jianqiao Li
Junya Chen
Lawrence Carin
46
0
0
16 Jun 2020
Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning
Kaiyi Ji
Junjie Yang
Yingbin Liang
103
50
0
18 Feb 2020