1810.06721
Optimizing Agent Behavior over Long Time Scales by Transporting Value
15 October 2018
Chia-Chun Hung
Timothy Lillicrap
Josh Abramson
Yan Wu
M. Berk Mirza
Federico Carnevale
Arun Ahuja
Greg Wayne
Papers citing
"Optimizing Agent Behavior over Long Time Scales by Transporting Value"
50 / 82 papers shown
Tree Search for LLM Agent Reinforcement Learning
Yuxiang Ji
Ziyu Ma
Yong Wang
Guanhua Chen
Xiangxiang Chu
Liaoni Wu
AI4CE
253
21
0
25 Sep 2025
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Kazuki Irie
Morris Yau
Samuel J. Gershman
248
8
0
31 May 2025
The challenge of hidden gifts in multi-agent reinforcement learning
Dane Malenfant
Blake A. Richards
455
1
0
26 May 2025
Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
Egor Cherepanov
Nikita Kachaev
A. Kovalev
Aleksandr I. Panov
OffRL
571
14
0
14 Feb 2025
Evolution and The Knightian Blindspot of Machine Learning
Joel Lehman
Elliot Meyerson
Tarek El-Gaaly
Kenneth O. Stanley
Tarin Ziyaee
385
7
0
22 Jan 2025
Token-level Proximal Policy Optimization for Query Generation
Yichen Ouyang
Lu Wang
Fangkai Yang
Lu Wang
Chenghua Huang
...
Saravan Rajmohan
Weiwei Deng
Dongmei Zhang
Feng Sun
Qi Zhang
OffRL
936
9
0
01 Nov 2024
VLASCD: A Visual Language Action Model for Simultaneous Chatting and Decision Making
Zuojin Tang
Bin-Bin Hu
Chenyang Zhao
De Ma
Gang Pan
Yinan Han
406
1
0
21 Oct 2024
Action abstractions for amortized sampling
International Conference on Learning Representations (ICLR), 2024
Oussama Boussif
Léna Néhale Ezzine
J. Viviano
Michał Koziarski
Moksh Jain
Nikolay Malkin
Emmanuel Bengio
Rim Assouel
Yoshua Bengio
261
3
0
19 Oct 2024
Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning
International Conference on Learning Representations (ICLR), 2024
Hung Le
Kien Do
D. Nguyen
Sunil Gupta
Svetha Venkatesh
319
7
0
14 Oct 2024
Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL
Eduardo Pignatelli
Johan Ferret
Tim Rocktäschel
Edward Grefenstette
Davide Paglieri
Samuel Coward
Laura Toni
324
11
0
19 Sep 2024
Equivariant Reinforcement Learning under Partial Observability
Conference on Robot Learning (CoRL), 2024
Hai Nguyen
Andrea Baisero
David Klee
Dian Wang
Robert Platt
Christopher Amato
265
26
0
26 Aug 2024
Variable-Agnostic Causal Exploration for Reinforcement Learning
Minh Hoang Nguyen
Hung Le
Svetha Venkatesh
CML
322
3
0
17 Jul 2024
Rethinking Transformers in Solving POMDPs
Chenhao Lu
Ruizhe Shi
Yuyao Liu
Kaizhe Hu
Simon S. Du
Huazhe Xu
AI4CE
461
9
0
27 May 2024
Mastering Memory Tasks with World Models
Mohammad Reza Samsami
Artem Zholus
Janarthanan Rajendran
Sarath Chandar
CLL
OffRL
390
41
0
07 Mar 2024
Spatially-Aware Transformer for Embodied Agents
Junmo Cho
Jaesik Yoon
Sungjin Ahn
358
3
0
23 Feb 2024
Do Transformer World Models Give Better Policy Gradients?
Michel Ma
Tianwei Ni
Clement Gehring
P. D'Oro
Pierre-Luc Bacon
302
7
0
07 Feb 2024
Policy Optimization with Smooth Guidance Learned from State-Only Demonstrations
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Zhiming Zheng
470
0
0
30 Dec 2023
Episodic Return Decomposition by Difference of Implicitly Assigned Sub-Trajectory Reward
Hao-Chu Lin
Hongqiu Wu
Jiaji Zhang
Yihao Sun
Junyin Ye
Yang Yu
222
3
0
17 Dec 2023
Neuro-Inspired Fragmentation and Recall to Overcome Catastrophic Forgetting in Curiosity
Jaedong Hwang
Zhang-Wei Hong
Eric Chen
Akhilan Boopathy
Pulkit Agrawal
Ila Fiete
CLL
204
5
0
26 Oct 2023
PCGPT: Procedural Content Generation via Transformers
Sajad Mohaghegh
Mohammad Amin Ramezan Dehnavi
Golnoosh Abdollahinejad
Matin Hashemi
ViT
241
4
0
03 Oct 2023
Karma: Adaptive Video Streaming via Causal Sequence Modeling
ACM Multimedia (ACM MM), 2023
Bo Xu
Hao Chen
Zhanghui Ma
CML
87
8
0
20 Aug 2023
Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning
Akash Velu
Skanda Vaidyanath
Dilip Arumugam
OffRL
341
4
0
21 Jul 2023
Transformers in Reinforcement Learning: A Survey
Pranav Agarwal
A. Rahman
P. St-Charles
Simon J. D. Prince
Samira Ebrahimi Kahou
OffRL
293
28
0
12 Jul 2023
Grid Cell-Inspired Fragmentation and Recall for Efficient Map Building
Jaedong Hwang
Zhang-Wei Hong
Eric Chen
Akhilan Boopathy
Pulkit Agrawal
Ila Fiete
295
3
0
11 Jul 2023
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment
Neural Information Processing Systems (NeurIPS), 2023
Tianwei Ni
Michel Ma
Benjamin Eysenbach
Pierre-Luc Bacon
OffRL
587
62
0
07 Jul 2023
Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis
Neural Information Processing Systems (NeurIPS), 2023
Alexander Meulemans
Simon Schug
Seijin Kobayashi
Nathaniel D. Daw
Gregory Wayne
416
7
0
29 Jun 2023
Decision S4: Efficient Sequence-Based RL via State Spaces Layers
International Conference on Learning Representations (ICLR), 2023
Shmuel Bar-David
Itamar Zimerman
Eliya Nachmani
Lior Wolf
OffRL
228
32
0
08 Jun 2023
Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL
Neural Information Processing Systems (NeurIPS), 2022
Chen Sun
Wannan Yang
Thomas Jiralerspong
Dane Malenfant
Benjamin Alsbury-Nealy
Yoshua Bengio
Blake A. Richards
OffRL
397
3
0
12 Oct 2022
Reward Learning using Structural Motifs in Inverse Reinforcement Learning
Raeid Saqur
315
2
0
25 Sep 2022
Multi-Agent Reinforcement Learning for Long-Term Network Resource Allocation through Auction: a V2X Application
Computer Communications (Comput. Commun.), 2022
Jing Tan
R. Khalili
Holger Karl
A. Hecker
OffRL
176
5
0
29 Jul 2022
Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes
Neural Information Processing Systems (NeurIPS), 2022
A. Sorokin
N. Buzun
Leonid Pugachev
Andrey Kravchenko
397
12
0
27 Jul 2022
Off-Beat Multi-Agent Reinforcement Learning
Adaptive Agents and Multi-Agent Systems (AAMAS), 2022
Wei Qiu
Weixun Wang
Rongpin Wang
Bo An
Yujing Hu
S. Obraztsova
Zinovi Rabinovich
Jianye Hao
Yingfeng Chen
Changjie Fan
OffRL
230
2
0
27 May 2022
Modeling Human Behavior Part I -- Learning and Belief Approaches
Andrew Fuchs
A. Passarella
M. Conti
276
8
0
13 May 2022
Learning to Bid Long-Term: Multi-Agent Reinforcement Learning with Long-Term and Sparse Reward in Repeated Auction Games
Jing Tan
R. Khalili
Holger Karl
111
3
0
05 Apr 2022
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
Alexis Jacq
Johan Ferret
Olivier Pietquin
Matthieu Geist
217
11
0
16 Mar 2022
Selective Credit Assignment
Veronica Chelu
Diana Borsa
Doina Precup
Hado van Hasselt
221
3
0
20 Feb 2022
Bayesian sense of time in biological and artificial brains
Zafeirios Fountas
Alexey Zakharov
237
1
0
14 Jan 2022
Learning Reward Machines: A Study in Partially Observable Reinforcement Learning
Rodrigo Toro Icarte
Ethan Waldie
Toryn Q. Klassen
Richard Valenzano
Margarita P. Castro
Sheila A. McIlraith
206
18
0
17 Dec 2021
Episodic Policy Gradient Training
Hung Le
Majid Abdolshah
Thommen George Karimpanal
Kien Do
D. Nguyen
Svetha Venkatesh
BDL
OffRL
214
7
0
03 Dec 2021
Model-Based Episodic Memory Induces Dynamic Hybrid Controls
Neural Information Processing Systems (NeurIPS), 2021
Hung Le
Thommen George Karimpanal
Majid Abdolshah
T. Tran
Svetha Venkatesh
212
24
0
03 Nov 2021
Biological learning in key-value memory networks
Danil Tyulmankov
Ching Fang
Annapurna Vadaparty
G. R. Yang
271
36
0
26 Oct 2021
Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs
International Conference on Machine Learning (ICML), 2021
Tianwei Ni
Benjamin Eysenbach
Ruslan Salakhutdinov
456
155
0
11 Oct 2021
Evaluating the progress of Deep Reinforcement Learning in the real world: aligning domain-agnostic and domain-specific research
J. Luis
E. Crawley
B. Cameron
OffRL
303
6
0
07 Jul 2021
Preferential Temporal Difference Learning
International Conference on Machine Learning (ICML), 2021
N. Anand
Doina Precup
OOD
194
9
0
11 Jun 2021
Towards Practical Credit Assignment for Deep Reinforcement Learning
Vyacheslav Alipov
Riley Simmons-Edler
N.Yu. Putintsev
Pavel Kalinin
Dmitry Vetrov
OffRL
181
13
0
08 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Neural Information Processing Systems (NeurIPS), 2021
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
708
2,154
0
02 Jun 2021
Towards mental time travel: a hierarchical memory for reinforcement learning agents
Neural Information Processing Systems (NeurIPS), 2021
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Andrea Banino
Felix Hill
406
61
0
28 May 2021
An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning
Dilip Arumugam
Peter Henderson
Pierre-Luc Bacon
176
20
0
10 Mar 2021
Synthetic Returns for Long-Term Credit Assignment
David Raposo
Samuel Ritter
Adam Santoro
Greg Wayne
T. Weber
M. Botvinick
H. V. Hasselt
Francis Song
AI4TS
244
36
0
24 Feb 2021
Delayed Rewards Calibration via Reward Empirical Sufficiency
Yixuan Liu
Hu Wang
Xiaowei Wang
Xiaoyue Sun
Liuyue Jiang
Minhui Xue
226
0
0
21 Feb 2021