arXiv:1805.09801
Meta-Gradient Reinforcement Learning
24 May 2018
Zhongwen Xu
H. V. Hasselt
David Silver
Papers citing
"Meta-Gradient Reinforcement Learning"
50 / 203 papers shown
Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
BDL
34
0
0
12 May 2025
Scalable Meta-Learning via Mixed-Mode Differentiation
Iurii Kemaev
Dan A Calian
Luisa M Zintgraf
Gregory Farquhar
H. V. Hasselt
57
0
0
01 May 2025
Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks
Luise Ge
Michael Lanier
Anindya Sarkar
Bengisu Guresti
Yevgeniy Vorobeychik
Chongjie Zhang
47
0
0
26 Feb 2025
Discovering Quality-Diversity Algorithms via Meta-Black-Box Optimization
Maxence Faldor
Robert Tjarko Lange
Antoine Cully
81
0
0
04 Feb 2025
Reinforcement Teaching
Alex Lewandowski
Calarina Muslimani
Dale Schuurmans
Matthew E. Taylor
Jun Luo
81
1
0
28 Jan 2025
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
47
16
0
28 Jan 2025
Imitation Learning from Suboptimal Demonstrations via Meta-Learning An Action Ranker
Jiangdong Fan
Hongcai He
Paul Weng
Hui Xu
Jie Shao
34
1
0
31 Dec 2024
Segmenting Action-Value Functions Over Time-Scales in SARSA via TD(Δ)
Mahammad Humayoo
66
0
0
22 Nov 2024
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Zhi Wang
Li Zhang
Wenhao Wu
Yuanheng Zhu
Dongbin Zhao
C. L. Philip Chen
OffRL
39
6
0
15 Oct 2024
Lifelong Reinforcement Learning via Neuromodulation
Sebastian Lee
Samuel Liebana Garcia
Claudia Clopath
Will Dabney
49
0
0
15 Aug 2024
Meta SAC-Lag: Towards Deployable Safe Reinforcement Learning via MetaGradient-based Hyperparameter Tuning
Homayoun Honari
Amir M. Soufi Enayati
Mehran Ghafarian Tamizi
H. Najjaran
53
1
0
15 Aug 2024
Black box meta-learning intrinsic rewards for sparse-reward environments
Octavio Pappalardo
Rodrigo Ramele
Juan Miguel Santos
OffRL
46
0
0
31 Jul 2024
Meta-Gradient Search Control: A Method for Improving the Efficiency of Dyna-style Planning
Bradley Burega
John D. Martin
Luke Kapeluck
Michael Bowling
40
0
0
27 Jun 2024
Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
Aditya A. Ramesh
Kenny Young
Louis Kirsch
Jürgen Schmidhuber
26
1
0
06 May 2024
Next Generation Loss Function for Image Classification
S. Akhmedova
Nils Körber
VLM
32
2
0
19 Apr 2024
Imitating Cost-Constrained Behaviors in Reinforcement Learning
Qian Shao
Pradeep Varakantham
Shih-Fen Cheng
25
1
0
26 Mar 2024
In-context Exploration-Exploitation for Reinforcement Learning
Zhenwen Dai
Federico Tomasi
Sina Ghiassian
OffRL
OnRL
43
3
0
11 Mar 2024
Discovering Temporally-Aware Reinforcement Learning Algorithms
Matthew Jackson
Chris Xiaoxuan Lu
Louis Kirsch
R. T. Lange
Shimon Whiteson
Jakob N. Foerster
27
18
0
08 Feb 2024
MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters
Arsalan Sharifnassab
Saber Salehkaleybar
Richard Sutton
35
3
0
04 Feb 2024
SymbolicAI: A framework for logic-based approaches combining generative models and solvers
Marius-Constantin Dinu
Claudiu Leoveanu-Condrei
Markus Holzleitner
Werner Zellinger
Sepp Hochreiter
43
10
0
01 Feb 2024
Step-size Optimization for Continual Learning
T. Degris
Khurram Javed
Arsalan Sharifnassab
Yuxin Liu
Richard Sutton
21
2
0
30 Jan 2024
Policy Optimization with Smooth Guidance Learned from State-Only Demonstrations
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Zhiming Zheng
38
0
0
30 Dec 2023
Episodic Return Decomposition by Difference of Implicitly Assigned Sub-Trajectory Reward
Hao-Chu Lin
Hongqiu Wu
Jiaji Zhang
Yihao Sun
Junyin Ye
Yang Yu
24
2
0
17 Dec 2023
Context Shift Reduction for Offline Meta-Reinforcement Learning
Yunkai Gao
Rui Zhang
Jiaming Guo
Fan Wu
Qi Yi
...
Zidong Du
Xingui Hu
Qi Guo
Ling Li
Yunji Chen
OffRL
33
14
0
07 Nov 2023
Behavior Alignment via Reward Function Optimization
Dhawal Gupta
Yash Chandak
Scott M. Jordan
Philip S. Thomas
Bruno Castro da Silva
31
10
0
29 Oct 2023
Ever Evolving Evaluator (EV3): Towards Flexible and Reliable Meta-Optimization for Knowledge Distillation
Li Ding
M. Zoghi
Guy Tennenholtz
Maryam Karimzadehgan
26
0
0
29 Oct 2023
Imitation Learning from Observation with Automatic Discount Scheduling
Yuyang Liu
Weijun Dong
Yingdong Hu
Chuan Wen
Zhao-Heng Yin
Chongjie Zhang
Yang Gao
30
6
0
11 Oct 2023
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design
Matthew Jackson
Minqi Jiang
Jack Parker-Holder
Risto Vuorio
Chris Xiaoxuan Lu
Gregory Farquhar
Shimon Whiteson
Jakob N. Foerster
OOD
16
9
0
04 Oct 2023
AdaptNet: Policy Adaptation for Physics-Based Character Control
Pei Xu
Kaixiang Xie
Sheldon Andrews
P. Kry
Michael Neff
Morgan McGuire
Ioannis Karamouzas
Victor Zordan
TTA
37
17
0
30 Sep 2023
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov
P. D'Oro
Shagun Sodhani
Roberta Raileanu
Pierre-Luc Bacon
Pascal Vincent
Amy Zhang
Mikael Henaff
LRM
LLMAG
29
55
0
29 Sep 2023
ODE-based Recurrent Model-free Reinforcement Learning for POMDPs
Xu Zhao
Duzhen Zhang
Liyuan Han
Tielin Zhang
Bo Xu
37
7
0
25 Sep 2023
Diagnosing and exploiting the computational demands of videos games for deep reinforcement learning
L. Govindarajan
Rex G Liu
Drew Linsley
A. Ashok
Max Reuter
M. Frank
Thomas Serre
OffRL
21
0
0
22 Sep 2023
Causal Reinforcement Learning: A Survey
Zhi-Hong Deng
Jing Jiang
Guodong Long
Chen Zhang
CML
LRM
50
13
0
04 Jul 2023
A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
Shichang Zhang
Atefeh Sohrabizadeh
Cheng Wan
Zijie Huang
Ziniu Hu
Yewen Wang
Yingyan Lin
Jason Cong
Yizhou Sun
GNN
AI4CE
34
23
0
24 Jun 2023
Acceleration in Policy Optimization
Veronica Chelu
Tom Zahavy
A. Guez
Doina Precup
Sebastian Flennerhag
48
0
0
18 Jun 2023
Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes
Luca Sabbioni
Francesco Corda
Marcello Restelli
24
0
0
13 Jun 2023
Hyperparameters in Reinforcement Learning and How To Tune Them
Theresa Eimer
Marius Lindauer
Roberta Raileanu
OffRL
29
35
0
02 Jun 2023
On the Value of Myopic Behavior in Policy Reuse
Kang Xu
Chenjia Bai
Shuang Qiu
Haoran He
Bin Zhao
Zhen Wang
Wei Li
Xuelong Li
32
1
0
28 May 2023
Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization
R. T. Lange
Tom Schaul
Yutian Chen
Chris Xiaoxuan Lu
Tom Zahavy
Valentin Dalibard
Sebastian Flennerhag
35
34
0
08 Apr 2023
IQ-Flow: Mechanism Design for Inducing Cooperative Behavior to Self-Interested Agents in Sequential Social Dilemmas
Bengisu Guresti
Abdullah Vanlioglu
N. K. Üre
18
5
0
28 Feb 2023
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs
Theodore H. Moskovitz
Brendan O'Donoghue
Vivek Veeriah
Sebastian Flennerhag
Satinder Singh
Tom Zahavy
47
19
0
02 Feb 2023
A Survey of Meta-Reinforcement Learning
Jacob Beck
Risto Vuorio
E. Liu
Zheng Xiong
L. Zintgraf
Chelsea Finn
Shimon Whiteson
OOD
OffRL
37
122
0
19 Jan 2023
Human-Timescale Adaptation in an Open-Ended Task Space
Adaptive Agent Team
Jakob Bauer
Kate Baumli
Satinder Baveja
Feryal M. P. Behbahani
...
Jakub Sygnowski
K. Tuyls
Sarah York
Alexander Zacherl
Lei Zhang
LM&Ro
OffRL
AI4CE
LRM
38
109
0
18 Jan 2023
Optimistic Meta-Gradients
Sebastian Flennerhag
Tom Zahavy
Brendan O'Donoghue
Hado van Hasselt
András Gyorgy
Satinder Singh
39
3
0
09 Jan 2023
POMRL: No-Regret Learning-to-Plan with Increasing Horizons
Khimya Khetarpal
Claire Vernade
Brendan O'Donoghue
Satinder Singh
Tom Zahavy
OffRL
31
0
0
30 Dec 2022
Reusable Options through Gradient-based Meta Learning
David Kuric
H. V. Hoof
34
0
0
22 Dec 2022
General-Purpose In-Context Learning by Meta-Learning Transformers
Louis Kirsch
James Harrison
Jascha Narain Sohl-Dickstein
Luke Metz
40
72
0
08 Dec 2022
Hypernetworks for Zero-shot Transfer in Reinforcement Learning
S. Rezaei-Shoshtari
Charlotte Morissette
F. Hogan
Gregory Dudek
D. Meger
OffRL
17
14
0
28 Nov 2022
Discovering Evolution Strategies via Meta-Black-Box Optimization
R. T. Lange
Tom Schaul
Yutian Chen
Tom Zahavy
Valentin Dalibard
Chris Xiaoxuan Lu
Satinder Singh
Sebastian Flennerhag
44
47
0
21 Nov 2022
Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function
Clément Bonnet
Laurence Midgley
Alexandre Laterre
26
1
0
19 Nov 2022