Meta-Gradient Reinforcement Learning
arXiv: 1805.09801

24 May 2018
Zhongwen Xu
H. V. Hasselt
David Silver

Papers citing "Meta-Gradient Reinforcement Learning"

50 / 203 papers shown
Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
BDL
34
0
0
12 May 2025
Scalable Meta-Learning via Mixed-Mode Differentiation
Iurii Kemaev
Dan A Calian
Luisa M Zintgraf
Gregory Farquhar
H. V. Hasselt
57
0
0
01 May 2025
Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks
Luise Ge
Michael Lanier
Anindya Sarkar
Bengisu Guresti
Yevgeniy Vorobeychik
Chongjie Zhang
47
0
0
26 Feb 2025
Discovering Quality-Diversity Algorithms via Meta-Black-Box Optimization
Maxence Faldor
Robert Tjarko Lange
Antoine Cully
81
0
0
04 Feb 2025
Reinforcement Teaching
Alex Lewandowski
Calarina Muslimani
Dale Schuurmans
Matthew E. Taylor
Jun Luo
81
1
0
28 Jan 2025
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
47
16
0
28 Jan 2025
Imitation Learning from Suboptimal Demonstrations via Meta-Learning An Action Ranker
Jiangdong Fan
Hongcai He
Paul Weng
Hui Xu
Jie Shao
34
1
0
31 Dec 2024
Segmenting Action-Value Functions Over Time-Scales in SARSA via TD($\Delta$)
Mahammad Humayoo
66
0
0
22 Nov 2024
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Zhi Wang
Li Zhang
Wenhao Wu
Yuanheng Zhu
Dongbin Zhao
C. L. Philip Chen
OffRL
39
6
0
15 Oct 2024
Lifelong Reinforcement Learning via Neuromodulation
Sebastian Lee
Samuel Liebana Garcia
Claudia Clopath
Will Dabney
49
0
0
15 Aug 2024
Meta SAC-Lag: Towards Deployable Safe Reinforcement Learning via MetaGradient-based Hyperparameter Tuning
Homayoun Honari
Amir M. Soufi Enayati
Mehran Ghafarian Tamizi
H. Najjaran
53
1
0
15 Aug 2024
Black box meta-learning intrinsic rewards for sparse-reward environments
Octavio Pappalardo
Rodrigo Ramele
Juan Miguel Santos
OffRL
46
0
0
31 Jul 2024
Meta-Gradient Search Control: A Method for Improving the Efficiency of Dyna-style Planning
Bradley Burega
John D. Martin
Luke Kapeluck
Michael Bowling
40
0
0
27 Jun 2024
Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
Aditya A. Ramesh
Kenny Young
Louis Kirsch
Jürgen Schmidhuber
26
1
0
06 May 2024
Next Generation Loss Function for Image Classification
S. Akhmedova
Nils Körber
VLM
32
2
0
19 Apr 2024
Imitating Cost-Constrained Behaviors in Reinforcement Learning
Qian Shao
Pradeep Varakantham
Shih-Fen Cheng
25
1
0
26 Mar 2024
In-context Exploration-Exploitation for Reinforcement Learning
Zhenwen Dai
Federico Tomasi
Sina Ghiassian
OffRL
OnRL
43
3
0
11 Mar 2024
Discovering Temporally-Aware Reinforcement Learning Algorithms
Matthew Jackson
Chris Xiaoxuan Lu
Louis Kirsch
R. T. Lange
Shimon Whiteson
Jakob N. Foerster
27
18
0
08 Feb 2024
MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters
Arsalan Sharifnassab
Saber Salehkaleybar
Richard Sutton
35
3
0
04 Feb 2024
SymbolicAI: A framework for logic-based approaches combining generative models and solvers
Marius-Constantin Dinu
Claudiu Leoveanu-Condrei
Markus Holzleitner
Werner Zellinger
Sepp Hochreiter
43
10
0
01 Feb 2024
Step-size Optimization for Continual Learning
T. Degris
Khurram Javed
Arsalan Sharifnassab
Yuxin Liu
Richard Sutton
21
2
0
30 Jan 2024
Policy Optimization with Smooth Guidance Learned from State-Only Demonstrations
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Zhiming Zheng
38
0
0
30 Dec 2023
Episodic Return Decomposition by Difference of Implicitly Assigned Sub-Trajectory Reward
Hao-Chu Lin
Hongqiu Wu
Jiaji Zhang
Yihao Sun
Junyin Ye
Yang Yu
24
2
0
17 Dec 2023
Context Shift Reduction for Offline Meta-Reinforcement Learning
Yunkai Gao
Rui Zhang
Jiaming Guo
Fan Wu
Qi Yi
...
Zidong Du
Xingui Hu
Qi Guo
Ling Li
Yunji Chen
OffRL
33
14
0
07 Nov 2023
Behavior Alignment via Reward Function Optimization
Dhawal Gupta
Yash Chandak
Scott M. Jordan
Philip S. Thomas
Bruno Castro da Silva
31
10
0
29 Oct 2023
Ever Evolving Evaluator (EV3): Towards Flexible and Reliable Meta-Optimization for Knowledge Distillation
Li Ding
M. Zoghi
Guy Tennenholtz
Maryam Karimzadehgan
26
0
0
29 Oct 2023
Imitation Learning from Observation with Automatic Discount Scheduling
Yuyang Liu
Weijun Dong
Yingdong Hu
Chuan Wen
Zhao-Heng Yin
Chongjie Zhang
Yang Gao
30
6
0
11 Oct 2023
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design
Matthew Jackson
Minqi Jiang
Jack Parker-Holder
Risto Vuorio
Chris Xiaoxuan Lu
Gregory Farquhar
Shimon Whiteson
Jakob N. Foerster
OOD
16
9
0
04 Oct 2023
AdaptNet: Policy Adaptation for Physics-Based Character Control
Pei Xu
Kaixiang Xie
Sheldon Andrews
P. Kry
Michael Neff
Morgan McGuire
Ioannis Karamouzas
Victor Zordan
TTA
37
17
0
30 Sep 2023
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov
P. D'Oro
Shagun Sodhani
Roberta Raileanu
Pierre-Luc Bacon
Pascal Vincent
Amy Zhang
Mikael Henaff
LRM
LLMAG
29
55
0
29 Sep 2023
ODE-based Recurrent Model-free Reinforcement Learning for POMDPs
Xu Zhao
Duzhen Zhang
Liyuan Han
Tielin Zhang
Bo Xu
37
7
0
25 Sep 2023
Diagnosing and exploiting the computational demands of videos games for deep reinforcement learning
L. Govindarajan
Rex G Liu
Drew Linsley
A. Ashok
Max Reuter
M. Frank
Thomas Serre
OffRL
21
0
0
22 Sep 2023
Causal Reinforcement Learning: A Survey
Zhi-Hong Deng
Jing Jiang
Guodong Long
Chen Zhang
CML
LRM
50
13
0
04 Jul 2023
A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
Shichang Zhang
Atefeh Sohrabizadeh
Cheng Wan
Zijie Huang
Ziniu Hu
Yewen Wang
Yingyan Lin
Jason Cong
Yizhou Sun
GNN
AI4CE
34
23
0
24 Jun 2023
Acceleration in Policy Optimization
Veronica Chelu
Tom Zahavy
A. Guez
Doina Precup
Sebastian Flennerhag
48
0
0
18 Jun 2023
Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes
Luca Sabbioni
Francesco Corda
Marcello Restelli
24
0
0
13 Jun 2023
Hyperparameters in Reinforcement Learning and How To Tune Them
Theresa Eimer
Marius Lindauer
Roberta Raileanu
OffRL
29
35
0
02 Jun 2023
On the Value of Myopic Behavior in Policy Reuse
Kang Xu
Chenjia Bai
Shuang Qiu
Haoran He
Bin Zhao
Zhen Wang
Wei Li
Xuelong Li
32
1
0
28 May 2023
Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization
R. T. Lange
Tom Schaul
Yutian Chen
Chris Xiaoxuan Lu
Tom Zahavy
Valentin Dalibard
Sebastian Flennerhag
35
34
0
08 Apr 2023
IQ-Flow: Mechanism Design for Inducing Cooperative Behavior to Self-Interested Agents in Sequential Social Dilemmas
Bengisu Guresti
Abdullah Vanlioglu
N. K. Üre
18
5
0
28 Feb 2023
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs
Theodore H. Moskovitz
Brendan O'Donoghue
Vivek Veeriah
Sebastian Flennerhag
Satinder Singh
Tom Zahavy
47
19
0
02 Feb 2023
A Survey of Meta-Reinforcement Learning
Jacob Beck
Risto Vuorio
E. Liu
Zheng Xiong
L. Zintgraf
Chelsea Finn
Shimon Whiteson
OOD
OffRL
37
122
0
19 Jan 2023
Human-Timescale Adaptation in an Open-Ended Task Space
Adaptive Agent Team
Jakob Bauer
Kate Baumli
Satinder Baveja
Feryal M. P. Behbahani
...
Jakub Sygnowski
K. Tuyls
Sarah York
Alexander Zacherl
Lei Zhang
LM&Ro
OffRL
AI4CE
LRM
38
109
0
18 Jan 2023
Optimistic Meta-Gradients
Sebastian Flennerhag
Tom Zahavy
Brendan O'Donoghue
Hado van Hasselt
András Gyorgy
Satinder Singh
39
3
0
09 Jan 2023
POMRL: No-Regret Learning-to-Plan with Increasing Horizons
Khimya Khetarpal
Claire Vernade
Brendan O'Donoghue
Satinder Singh
Tom Zahavy
OffRL
31
0
0
30 Dec 2022
Reusable Options through Gradient-based Meta Learning
David Kuric
H. V. Hoof
34
0
0
22 Dec 2022
General-Purpose In-Context Learning by Meta-Learning Transformers
Louis Kirsch
James Harrison
Jascha Narain Sohl-Dickstein
Luke Metz
40
72
0
08 Dec 2022
Hypernetworks for Zero-shot Transfer in Reinforcement Learning
S. Rezaei-Shoshtari
Charlotte Morissette
F. Hogan
Gregory Dudek
D. Meger
OffRL
17
14
0
28 Nov 2022
Discovering Evolution Strategies via Meta-Black-Box Optimization
R. T. Lange
Tom Schaul
Yutian Chen
Tom Zahavy
Valentin Dalibard
Chris Xiaoxuan Lu
Satinder Singh
Sebastian Flennerhag
44
47
0
21 Nov 2022
Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function
Clément Bonnet
Laurence Midgley
Alexandre Laterre
26
1
0
19 Nov 2022